ABSTRACT

An "undesigned" experiment is one in which the predictor variables are
correlated, either due to a failure to complete a design or because the
investigator was unable to select or control relevant experimental
conditions. The traditional method of analyzing this class of experiment
(multiple regression analysis based on a least squares criterion) gives
rise to a number of interpretation problems when the effects of individual
predictors are to be assessed. Some difficulties and their effects on the
quality of information are discussed. Two methods are described in this
report for improving the information obtained from the undesigned human
factors experiment. One is to collect more information at a few data
points selected at locations that improve the orthogonality of this
non-orthogonal design. The other is to use a ridge regression analysis in
place of the conventional least squares analysis, in which a slight bias
is introduced into the data in such a way that the combined bias and
variance error is smaller than the variance error of unbiased estimates
from the least squares analysis. The ridge analysis produces more stable
and meaningful regression coefficients. Computational aids, both
references and complete computer programs, are supplied. (Author)
While the "undesigned" experiment is used extensively in personnel
selection research, it has been virtually ignored as a viable approach in
equipment design and training research. Traditionally, in these latter
problem areas, systematic designs have been used in which the primary
experimental variables are all controlled. As a result, variables that are
difficult or impossible to control are often excluded from the
experimental plan even when they are relevant and have an important effect
on performance. Consequently, much of the performance variability in the
experiment remains unexplained and the data is of limited value when
applied to real-world problems.

Unmanageable sources of variance, however, can be accounted for if they
are treated as variables of an "undesigned" experiment. Thus a most
effective use of the methods described in this report to enhance
"undesigned" experiments is to combine them with the "advanced
methodologies" described in previous reports (e.g., "Economical
Multifactor Designs" and "Methods of Handling Sequence Effects . . ."
(Simon, 1973; 1974)). By properly using these methods in combination, we
become capable of doing experiments that will account for most of the
variance associated with the performance of a real-world task and
eliminate major sources of irrelevant variance.

I would be interested in hearing about applications of these techniques
by behavioral scientists and am willing to discuss efforts in this regard.
Comments and criticisms are always welcomed.

Charles W. Simon
1975
SYMBOLOGY

Inverse of matrix A, where A = X'X

Regression coefficient of variable i for raw score data

True beta (standard regression) coefficient

Estimated beta (standard regression) coefficient from least squares
analysis

Biased estimator of ridge coefficient from ridge regression analysis

Statistical expectation; weighted integral of

Identity matrix: in a correlation matrix, all diagonal values equal one
and off-diagonal values equal zero

Constant used to distort correlation matrix in ridge regression analysis

Eigenvalue

Squared distance between true and estimated coefficients

Product of

Residual error variance; sigma squared, mean square error

Sum of (from 1 to p items)

Variance of an estimated response

Transpose of vector x (experimental condition)

Predictor variable i

Sum of squares and cross-product matrix (of values of every predictor
multiplied by one another, yielding Xi^2 and XiXj values); x vector
multiplied by its transpose; square root of elements of X'X divided by N
equals correlation matrix

Determinant of the (X'X) matrix

Cross-product between predictor and performance

Estimated performance (from regression equation)
SECTION I
INTRODUCTION

This report describes two methods of improving the information obtained
from the "undesigned" experiment. In the first approach, additional data
is collected in order to facilitate the interpretation of data already
collected. The second approach is a relatively new technique of data
analysis that provides better solutions than does the traditional least
squares analysis.

DEFINITION

An "undesigned" experiment is one in which some experimental variables
cannot be or are not controlled by the experimenter. To be included in an
experiment, therefore, the level of each variable must be known or
measured at the time each performance measurement is made. Under these
circumstances, variables in an undesigned experiment are correlated
mathematically to some degree, a condition which markedly complicates the
interpretation of the results.

EXAMPLES OF UNDESIGNED EXPERIMENTS

The following fictitious situations are examples of undesigned experiments
in human factors engineering research:

1. The Army has rewritten its maintenance manuals in a style that will
   enable the ordinary technician to understand and use the information
   better. They are interested in measuring the impact of this revision
   on system performance. Old and new manuals are made available at a
   number of maintenance depots where the technicians differ in training
   levels and experience with the particular equipment. At the depots,
   differences also exist in the availability of critical parts, the
   maintenance philosophy and schedules, the unit morale levels, and
   other factors that could conceivably affect the quality of
   maintenance. Since it is impossible to control these associated
   factors to any degree, a daily record is kept on each of them along
   with several criteria of maintenance performance over a six-month
   period. This data taken as a whole can be treated as an undesigned
   experiment.

2. The Air Force wishes to determine the optimum parameters for the
   manual control configuration of a missile-delivery system. They wish
   to reach a solution derived from empirical data collected under
   operational conditions. A flight test is planned in which the strike
   accuracy of a dummy air-to-ground missile is to be studied as a
   function of changes in control parameters. There is little opportunity
   to make a great many flights to offset the effects of such
   uncontrolled but critical factors as visibility, turbulence, and
   variations in the target itself. However, these variables can be
   measured at the time each missile is fired. While the control
   parameters can be systematically varied, the existence of the other
   uncontrolled but presumably critical factors makes this a partially
   undesigned experiment.

3. The Navy has built a research-oriented pilot-training simulator. A
   study is conducted to determine the least expensive simulator
   configuration that will result in the greatest transfer in pilot
   performance from simulator to aircraft. Two groups of pilots are
   selected for the study: those with less than 2000 flying hours and
   those with more than 5000 flying hours. It is recognized that flying
   time per se is not sufficient to characterize pilot skill and that
   such things as the type of aircraft, the nature of the flying
   experience (military or civilian; war-time or peace-time), and recency
   of this experience also should be taken into consideration. Since it
   is necessary to use all available pilots as subjects without an
   opportunity to control these other factors, pilot characteristics must
   be included in the analysis and handled as variables of an undesigned
   experiment.

4. Over a twelve-year period a research organization has conducted
   experiments relating equipment parameters to success in acquiring
   ground targets on an airborne display. During this time the effects of
   over fifteen variables associated with the sensor, the display, and
   the briefing information have been examined, but in a series of small
   experiments of two and three variables each. Since no overall research
   strategy was ever planned, the frequency with which certain variables
   and levels of variables occur in this data varies considerably. The
   resulting lack of a balanced design leaves predictor variables
   correlated. Thus this belated effort to combine the results of several
   experiments to develop a single prediction equation takes on the
   characteristics and problems of an undesigned experiment.

5. The levels of a factorial design are used as the data collection plan
   in a drug-therapy experiment. While the study is being run, it becomes
   apparent that two of the extreme conditions cannot be measured at all
   because they exceed physiological safety limits. This destroys the
   orthogonality of the design. The data that remains to be analyzed
   takes on the characteristics of an undesigned experiment.

DESIGNED VERSUS UNDESIGNED EXPERIMENTS

The goals of a good experiment should be to obtain new, relevant,
important, and lasting information which is capable of explaining most of
the performance variability associated with a particular real-world task.
In the behavioral sciences, unlike the physical sciences, performance
cannot be examined or evaluated independently of the context in which it
occurs and can only be described or predicted as a function of this
context. The more generalizable data therefore will be derived from
experiments in which critical context factors are varied rather than held
constant.

If, however, an investigator decides to study behavior in a realistic
context, he may find himself in circumstances where his ability to control
and adjust the levels of critical parameters is sorely limited. This means
that he can no longer plan and carry out a totally designed experiment and
must either limit the questions he can ask or resort to another approach.
The undesigned experiment, alone or in conjunction with a balanced design,
offers a viable alternative.
Characteristics of Designed Experiments

The value of a designed experiment rests on the fact that the experimental
conditions are selected in such a way that critical effects can be
isolated and the interpretation of the results simplified. However, there
is a price to be paid for these advantages, for the rigidity of the design
forces the experimenter to:

• anticipate in advance the variables he will include in his study;

• be able to control the exact levels of any variables that will be
  included in the study;

• include conditions that may be unrealistic or otherwise undesirable.


Positive Features of Undesigned Experiments

The undesigned experiment, because it generally accepts as the
experimental conditions those which exist at the moment a performance
measurement is made, does not face the same problems. The very lack of
control of the conditions under which performance data must be acquired
yields the following advantages for the undesigned experiment:*

1. The costs of collecting performance data are no longer as rigidly
   related to the number of factors being investigated. As many variables
   as desired can be considered as long as the level of each can be
   ascertained at the time performance data is being collected.

2. It is not always necessary to anticipate critical variables in advance
   of the data collection phase. If appropriate records are available,
   these may be used later to introduce more variables into the analysis.

3. Particularly in field studies, there is a greater likelihood that the
   majority of critical variables will be present (although not
   necessarily identified) and that the values that are used will be more
   representative of reality.

4. The regression approach permits an iteration in the search for the
   more critical variables. If the proportion of variance accounted for
   by the regression is low, other variables may be tried to see if they
   fit the data better, provided the necessary measures are available.

* It is by definition that these advantages fall to the undesigned rather
than the designed experiment. Obviously a number of these advantages could
exist for experiments that are planned by an experimenter who intends to
use some analysis of covariance design. However, to identify the class of
problems that will benefit from the techniques discussed in this report,
any situation in which variables are included in which the level selection
is not under the investigator's complete control is considered an
undesigned experiment.

Thus the undesigned experiment has the advantage of allowing (or forcing)
the experimenter to study the world as it really is. If the levels of
experimental variables are not selected artificially but are allowed to
vary naturally, the chances are higher that performance will be measured
under more representative circumstances with the relevant and critical
variables operating.

Measurement Sources. The only alternative to controlling the levels of
variables to be included in an experiment is to measure their levels as
they exist at the time performance is measured. The following are the most
common sources from which these measurements can be obtained:

• Concurrent measurements. As an event unfolds and performance is
  measured, concomitant variables of importance are also measured.
  (Example: Measuring air turbulence in a flight test.)

• Historical measurements. The data is obtained from past records that
  can in some way be associated with the conditions occurring at the time
  performance is being measured. (Example: Using subject aptitude scores
  from tests taken prior to his entering the pilot training course.)

• Incomplete measurements. The levels of each variable are already known,
  having been assigned as levels of a designed experiment which became
  degraded when certain conditions were omitted by choice or by accident.
  (Example: A factorial design is planned and data is collected at all
  but two corner points when a data recorder failed to operate.)
In practice, all or some of these sources may be used in a single
experiment.

Difficulties with Undesigned Experiments

There are penalties, however, associated with the freedom of data
collection for the undesigned experiment. The imbalance among combinations
of variables that is bound to occur when no systematic experimental design
is used leaves predictor variables correlated. As a result, the derived
equations are subject to greater error and information becomes scrambled
and difficult to isolate.

In the next section, some problems of interpreting the results of the
undesigned experiment are described along with general concepts and
terminology in regression analysis that are useful when reading this
report.
SECTION II

ANALYSIS AND INTERPRETATION PROBLEMS,
CONCEPTS, AND TERMINOLOGY

In this section, basic concepts and terminology relevant to multiple
regression analysis will be reviewed and problems in interpreting the
results from undesigned experiments identified. The discussion is
simplistic and intended only to supply the minimum detail required for a
reader to appreciate the value of the alternate techniques described in
subsequent sections. For an in-depth explanation of multiple regression
analysis, the reader is encouraged to read the excellent books and papers
that are available on this topic (e.g., Draper and Smith, 1968;
Darlington, 1968; Kerlinger and Pedhazur, 1973).



RAW DATA MATRICES

The experimental conditions and related performance in both designed and
undesigned experiments can be organized into a matrix format such as shown
in table T.1.

[Table T.1. Raw data matrix, one row per observation. Columns:
Observation; Subject; Predictor Variables (Xi): X1, X2, X3, X4, X5, ...,
XN; Obtained Performance (Y). The individual cell values are illegible.]

Each line represents one observation, i.e., the conditions, $X_i$, under
which performance was measured and the performance, Y, that was obtained.
There could of course be more than one performance measure for an
observation, e.g., speed and accuracy, and subject characteristics could
be included as factors among the predictor variables.

A primary difference between raw data matrices of designed and undesigned
experiments lies in the arrangement of the levels of the predictor
variables. In the designed experiment, these levels, being systematically
controlled by the experimenter, are generally selected in a balanced
fashion so that the main effects of the predictor variables are orthogonal
(i.e., uncorrelated). The factorial design is one of the more familiar
examples in which the levels of each variable are combined equally often
with every other variable to achieve this orthogonality. As a result, the
analysis and interpretation of the results are simplified. In the
undesigned experiment, this balance is not achieved because the
experimenter is unable (or fails) to select or control the levels of the
experimental conditions. As a result, main effects of predictor variables
are correlated with one another, a condition that makes the analysis and
interpretation of the results more difficult. This correlation is a
mathematical dependence, a happenstance of the levels that occurred at the
time the measures were taken, and does not necessarily imply a causal
relationship between the pair of variables.

' f . . • - „ -■ V. . 

CORRELATION MATRICES

The distinction between a designed and undesigned experiment is easier to
illustrate if the raw data matrix is transformed into a correlation matrix
composed of the linear (Pearson product moment) correlations among all
variables.
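As a minimal sketch of that transformation, the following Python computes
Pearson correlations for two small data sets. The data are hypothetical
(they do not come from this report): a complete 2 x 2 factorial plan, and
the same plan with one corner point missing. The function names are also
illustrative only.

```python
import math

def pearson(x, y):
    """Linear (Pearson product-moment) correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Correlation of every column with every column, including itself."""
    return [[pearson(ci, cj) for cj in columns] for ci in columns]

# Hypothetical data: a complete 2 x 2 factorial plan (designed) ...
designed = [[-1, -1, 1, 1], [-1, 1, -1, 1]]
# ... and the same plan with one corner point lost (undesigned).
undesigned = [[-1, -1, 1], [-1, 1, -1]]

r_designed = correlation_matrix(designed)
r_undesigned = correlation_matrix(undesigned)
print(r_designed[0][1])    # off-diagonal value for the balanced plan
print(r_undesigned[0][1])  # off-diagonal value once balance is lost
```

In this toy case the balanced plan gives an off-diagonal correlation of
zero, while losing a single corner point leaves the two predictors
correlated.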



Conventional regression analysis handles only a single performance
variable per analysis. Kerlinger and Pedhazur (1973, 376-381) describe a
method of doing multivariate regression analysis with two dependent
variables at the same time, illustrating how the combined analysis
provides a clearer interpretation than separate analyses, each with a
single and different dependent variable.

For a designed experiment, a fictitious correlation matrix for three
predictor variables and a performance variable might look like table T.2.

Linear                      Predictor Variables      Performance
Correlations                 X1      X2      X3          (Y)

Predictor         X1          1       0       0         0.342
Variables         X2          0       1       0        -0.167
                  X3          0       0       1         0.523

                                                            [T.2]



The table of intercorrelations can be broken into two parts: one, the
predictor matrix of correlations among each predictor variable and every
predictor variable including itself, and two, the performance column
vector of correlations between each predictor variable and performance.

Note that since each predictor variable correlates perfectly and
positively with itself, the diagonal values are all one. Note further that
with the designed experiment, all off-diagonal values are zero, showing
that the linear components of the predictors are all orthogonal to one
another. A matrix with only zeros off the diagonal is referred to as a
diagonal matrix. When the numbers on the diagonal are all ones, the matrix
is called a unit matrix.

In the undesigned experiment, the intercorrelation matrix for the
predictor variables is not likely to have zero correlations in the
off-diagonal positions. Instead, for the undesigned experiment, the
correlation table might look like table T.3.

Linear                      Predictor Variables      Performance
Correlations                 X1      X2      X3          (Y)

Predictor         X1        1.00    0.145   0.352       0.674
Variables         X2        0.145   1.00    0.2?2       0.532
                  X3        0.352   0.2?2   1.00        0.348

                                                            [T.3]

(? marks a digit that is illegible in the source.)
When the off-diagonal elements are non-zero, the predictor variables are
correlated. In that case, the matrix is said to be ill-conditioned and the
original experimental design said to be non-orthogonal.

Note that in both tables T.2 and T.3, the predictor matrix is symmetrical
about the diagonal. In some texts, only half of the matrix (above or below
the diagonal) will be written out.

To be able to analyze this data by regression analysis, the matrix must be
non-singular. This means that each row (or column) of the matrix must be
linearly independent of every other row (or column). No row (or column) is
produced from any linear combination of others in the matrix.
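One common way to check the non-singularity condition is to compute the
determinant of the predictor matrix: a value of zero signals linear
dependence among the rows. The sketch below is an illustration, not a
program from this report. It uses the predictor matrix of table T.2 and
two hypothetical 2 x 2 matrices (correlations of 0.60 and 1.00), and the
function name and the 1e-12 tolerance are assumptions.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]  # work on a float copy
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # a zero pivot means the rows are linearly dependent
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

orthogonal = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # predictor matrix of table T.2
correlated = [[1.0, 0.6], [0.6, 1.0]]           # hypothetical, r = 0.60
duplicated = [[1.0, 1.0], [1.0, 1.0]]           # hypothetical, r = 1.00

print(determinant(orthogonal), determinant(correlated), determinant(duplicated))
```

The orthogonal matrix has the largest determinant, the correlated matrix a
smaller one, and the matrix built from two identical predictors is
singular and cannot be analyzed.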

MULTIPLE REGRESSION ANALYSIS

Given the information in raw data or correlational form, the investigator
ordinarily subjects it to an analysis that reduces it to a linear
polynomial equation that will provide the "best" estimate of performance
under specific conditions of the predictor variables.



Each line of the raw data or correlation matrix represents an equation.
Performing a multiple regression analysis on the data is the same as
finding the common solution to a set of simultaneous equations.

The equation derived from an analysis of the raw data will be written in
the following form:

$$b_0 X_0 + b_1 X_1 + b_2 X_2 + \cdots + b_N X_N = \hat{Y}$$    [E.1]

where $b_0 X_0$ is a constant and $b_i$ (i = 1 through N) are regression
coefficients for the N independent variables, $X_i$ (i = 1 through N),
respectively. In practice, the $X_i$ terms can represent main effects or
transgenerations of main effects, such as cross-products ($X_i X_j$) or
higher order terms ($X_i^2$), each treated in the analysis as if it were
another variable. A regression coefficient, $b_i$, is the average change
in performance that will occur for each unit change in the particular
variable; this change may be positive or negative. The value, $\hat{Y}$,
is the performance estimated by the equation for particular values of the
predictor variables, $X_i$.



Least Squares Fit

The coefficients derived by multiple regression analysis are the ones used
in the polynomial to provide the "best" fit of the data. The criterion for
a "best" fit is met when the sum of the squares of the differences between
the observed and the estimated performance values is at a minimum. The
difference between the observed and the estimated performance values is
called the residual; thus the "best" fit is obtained from the equation
that minimizes the residual sum of squares (RSS).
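The least squares criterion can be sketched in a few lines of Python. This
is an illustration only, not one of the programs supplied with this
report: it solves the simultaneous (normal) equations X'X b = X'Y by
elimination, with a leading column of ones standing in for the constant
term. The four observations are hypothetical, and all function names are
assumptions.

```python
def solve(a, b):
    """Solve the simultaneous equations a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def least_squares(rows, y):
    """Coefficients [b0, b1, ..., bN] that minimize the residual sum of squares."""
    x = [[1.0] + list(row) for row in rows]  # leading 1 carries the constant
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(p)]
    return solve(xtx, xty)

def rss(rows, y, coef):
    """Residual sum of squares for a given set of coefficients."""
    total = 0.0
    for row, yi in zip(rows, y):
        estimate = coef[0] + sum(b * v for b, v in zip(coef[1:], row))
        total += (yi - estimate) ** 2
    return total

# Hypothetical observations at the four points of a 2 x 2 factorial.
rows = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
y = [3.0, 4.0, 6.0, 9.0]
coef = least_squares(rows, y)
print(coef, rss(rows, y, coef))
```

Moving any coefficient away from the fitted value (for example, replacing
the second coefficient with a slightly larger number) raises the value
returned by `rss`, which is what "best" fit means under this criterion.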



There are other criteria for judging the merits of an equation. Kiefer
(1959) discusses a number of these in detail. Later on in this paper, some
weaknesses of a least squares solution of data obtained from
non-orthogonal experimental designs will be discussed and alternative
criteria proposed.

Standard Regression Equation

Variables are commonly measured in different units and on different
scales. In order to compare the coefficients of these variables, the
values in the raw data table can be converted to standard measures, or Z
scores. This is done for each variable as follows:

$$\frac{X - M}{\sigma} = Z$$

where X is the raw score to be converted, M is the variable mean value,
and $\sigma$ is the standard deviation. If these standard scores are
subjected to a multiple regression analysis then the resulting polynomial
is referred to as a standard regression equation of the following form:

$$\hat{Y}_Z = \beta_1 Z_1 + \beta_2 Z_2 + \beta_3 Z_3 + \cdots + \beta_N Z_N$$    [E.2]

wherein the regression coefficients, $b_i$, of Equation E.1 are replaced
by beta coefficients, $\beta_i$, and there is no longer a constant term. A
regression analysis of the data in a correlation matrix results in a
standard regression equation.

1 " , » . ■ . ' ' . - , 0 ' ^J .•• 

If either the ordinary regression equation or the standard regression
equation has been calculated, the other can be derived according to the
following relationships:

$$b_i = \beta_i \frac{\sigma_Y}{\sigma_{X_i}} \quad \text{or} \quad \beta_i = b_i \frac{\sigma_{X_i}}{\sigma_Y}$$

The constant, $b_0 X_0$, for the multiple regression equation is found as
follows:

$$b_0 X_0 = Y_{Mean} - (b_1 X_{1Mean} + b_2 X_{2Mean} + \cdots + b_N X_{NMean})$$
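These conversions are simple enough to sketch directly. The functions
below are a minimal illustration of the relationships just given; their
names and the numerical values are hypothetical and are not taken from
this report.

```python
def beta_from_b(b, sd_x, sd_y):
    """Standard (beta) coefficient from a raw-score coefficient."""
    return b * sd_x / sd_y

def b_from_beta(beta, sd_x, sd_y):
    """Raw-score coefficient from a standard (beta) coefficient."""
    return beta * sd_y / sd_x

def constant_term(y_mean, b, x_means):
    """Constant b0X0: the Y mean minus the sum of each b_i times its X_i mean."""
    return y_mean - sum(bi * mi for bi, mi in zip(b, x_means))

# Hypothetical values for one predictor and the performance measure.
b1, sd_x1, sd_y = 2.0, 3.0, 12.0
beta1 = beta_from_b(b1, sd_x1, sd_y)
print(beta1)                                          # 0.5
print(b_from_beta(beta1, sd_x1, sd_y))                # 2.0
print(constant_term(20.0, [2.0, -1.0], [4.0, 5.0]))   # 17.0
```

Converting in one direction and then back recovers the original
coefficient, which is a quick check that the two relationships are
inverses of each other.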



Interpreting Multiple Regression Analysis

From his regression analysis, an investigator ordinarily is interested in
obtaining the following information:

• An equation to be used to estimate performance at specific coordinates
  of the experimental space.

• Measures of the relative importance of the experimental variables.

This information is generally easy to obtain when orthogonal designs are
used. However, this is not the case when results from undesigned
(non-orthogonal) experiments are to be interpreted. Let us examine both
cases.

Orthogonal Designs. The equation derived from a multiple regression
analysis might appear as in the case of this fictitious example:

$$\hat{Y} = 0.45 + 1.53 X_1 - 8.49 X_2 + 0.67 X_3$$

where $\hat{Y}$ is the estimated number of targets found as a function of
$X_1$, dynamic range of the display in log foot-lamberts; $X_2$, sensor
resolution in 10-foot units; and $X_3$, display size (diameter) in inches.
With an orthogonal design, the coefficients in the equation can be
interpreted as follows. Each time the dynamic range on the display
increased one log foot-lambert, 1.53 more targets were found on the
average. Each time the resolution of the sensor was increased by 10 feet,
8.49 fewer targets on the average were found. Each time the diameter of
the display was increased an inch, 0.67 more targets were found on the
average. It is understood, as in all regression analyses, that these
relationships hold only within the boundaries set by the data collection
points in the original experiment.
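The unit-change reading of the coefficients can be illustrated by
evaluating the fictitious equation at two nearby points. The operating
point used below is hypothetical, since the ranges sampled in the example
are not given, and the function name is an assumption.

```python
def estimated_targets(dynamic_range, resolution, diameter):
    """Fictitious equation from the text: 0.45 + 1.53*X1 - 8.49*X2 + 0.67*X3."""
    return 0.45 + 1.53 * dynamic_range - 8.49 * resolution + 0.67 * diameter

# Hypothetical operating point: X1 = 2.0, X2 = 0.5, X3 = 10.0.
base = estimated_targets(2.0, 0.5, 10.0)
one_more_inch = estimated_targets(2.0, 0.5, 11.0)
coarser_sensor = estimated_targets(2.0, 1.5, 10.0)

print(round(one_more_inch - base, 2))    # change for one extra inch of diameter
print(round(coarser_sensor - base, 2))   # change for one more 10-foot unit
```

Each difference equals the coefficient of the variable that was changed,
whatever values the other two variables are held at.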

In equipment design problems, this information may be enough to compare
the relative importance of the different predictor variables. Since the
levels of all three variables can be converted to a common scale of
engineering costs (to achieve a particular resolution, dynamic range, or
size), no refinement of the equation is actually required to decide their
relative importance in the applied situation. This is not often the case
in other fields of psychology where no common base among variables exists.
Thus it would be quite difficult to know which contributed more to success
in school by studying the raw score coefficients in an equation that
relates school success to scores on a reading test and a math test. There
is no common base to work with. In that case, to compare the relative
importance of the individual variables on the performance, the predictor
variables must be changed to standard scores and the equation written in a
standard regression form.

For example, a multiple regression analysis of the correlation table, T.2,
would yield the following standard regression equation:

$$\hat{Y}_Z = 0.342 Z_1 - 0.167 Z_2 + 0.523 Z_3$$

The effect of Variable 1 on performance is twice as large as the effect of
Variable 2 (in terms of their standard scores, Z). However, it is
questionable whether or not the use of standard coefficients is as
meaningful for an applied problem that can be related to a common cost
scale as the coefficients of the ordinary regression equation would be.

Calculations, however, can be made from the coefficients of the standard regression equation that might add to the understanding of the data. When derived from fully orthogonal designs:

1. These coefficients are the same as the linear correlation between each predictor and performance, as seen in the XY column of the table of intercorrelation.

2. The square of each coefficient shows the proportion of the total variability in performance that each predictor accounts for.

3. The sum of the squared coefficients shows the proportion of the total performance variability that can be explained by the total standard regression equation, and one minus that value shows the proportion that is not explained.
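As a quick check of the three points above, the coefficients quoted for the standard regression equation (0.342, 0.167 and 0.523 in absolute value) can be squared and summed. This is only a minimal sketch: it assumes, as the text states, that the design behind table T.2 is fully orthogonal, and the function name `variance_shares` is illustrative rather than part of the report's programs.

```python
def variance_shares(std_coefficients):
    """Proportion of performance variance credited to each predictor.

    Valid only for a fully orthogonal design, where each standard
    coefficient equals the predictor-performance correlation.
    """
    shares = [b ** 2 for b in std_coefficients]  # point 2: squared coefficients
    explained = sum(shares)                      # point 3: total explained
    return shares, explained, 1.0 - explained    # and the unexplained remainder


shares, explained, unexplained = variance_shares([0.342, -0.167, 0.523])
print([round(s, 3) for s in shares])
print(round(explained, 3), round(unexplained, 3))
```

Under these assumptions the three predictors together would account for a little over 40 percent of the performance variability. The same arithmetic is not meaningful once the predictors are correlated.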


Non-Orthogonal Designs. While regression equations from undesigned (predictor-correlated) experiments are mathematically the same as those from orthogonally designed experiments, pragmatically they are not. Although in both cases the overall equation does represent the best fit of the data (according to the least squares criterion), in the case of data from undesigned experiments, the beta coefficients of individual terms should not be considered independently. However tempting it may be to do so, when predictor variables are markedly correlated, the beta coefficients should not be individually interpreted to show the relative importance of the variables. The relative magnitudes of these coefficients are the result in part of arbitrary decisions made by the investigator during the analysis. This can best be explained by example.

In figure F.1, two factors, X1 and X2, account for 25 and 36 percent, respectively, of the total variability in performance (Y). A standard regression equation based on these two factors alone would be:



PROPORTION OF PERFORMANCE VARIANCE, Y, ACCOUNTED FOR BY TWO INDEPENDENT PREDICTOR VARIABLES, X1 AND X2 [F.1]

PROPORTION OF PERFORMANCE VARIANCE, Y, ACCOUNTED FOR BY TWO CORRELATED PREDICTOR VARIABLES, X1 AND X2 [F.2]



Interpretation in this case is straightforward. The relative contributions of each variable can be estimated; thirty-nine percent of the performance variability is still left unexplained.
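The arithmetic behind figure F.1 can be written out as a short worked equation. This is a sketch only: it assumes both correlations are positive, so that each standard coefficient is the positive square root of the proportion of variance quoted above, and the subscripted notation is illustrative rather than taken from the report.

```latex
\begin{aligned}
\beta_1 &= r_{1Y} = \sqrt{0.25} = 0.50, \qquad
\beta_2 = r_{2Y} = \sqrt{0.36} = 0.60, \\
Z_Y' &= 0.50\,Z_1 + 0.60\,Z_2, \\
1 - (0.25 + 0.36) &= 0.39 .
\end{aligned}
```

The last line reproduces the thirty-nine percent of performance variability left unexplained.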


In figure F.2, the two factors, X1 and X2, again overlap Y by 25 and 36 percent, respectively. This time, however, they are also correlated 0.60 with one another. It is no longer a simple matter to decide how much of an effect each variable has on performance. Where X1 overlaps X2 and Y, how can one determine whether the effect on Y is due to X1 or X2? If the effect on Y in the overlap portion is due to X1, then X2 does not have as much of an effect as the simple correlation between X2 and Y suggests. If the effect on Y in the overlap portion is actually due to X2, then X1 does not have as much effect as its correlation with Y suggests. Because the data themselves do not directly suggest which alternative is correct, using regression analysis on data with correlated predictors can give a number of solutions, depending on the order in which variables are introduced into the analysis.

In the above example, if the effect of X1 (including the X1X2 overlap) were removed first, only 14 percent would be left for X2 (excluding the X1X2 overlap) to affect Y. In that case, the equation would be written:

On the other hand, had the full effect of X2 been removed first, then the effect of X1 that remained after taking into consideration the X1X2 overlap would have been reduced and the equation would have been:

Both equations would estimate Zy' equally well, each accounting for 0.39 of the total variance. In both equations, the first beta coefficient corresponds to the full correlation between that variable and performance; the second beta coefficient, however, corresponds to a semi-partial correlation after the effect of all prior variables has been removed from the predictor under consideration. As the number (N) of correlated variables increases, the number of ways in which they can be ordered into the equation (N!) illustrates the numerous solutions that are possible and why interpreting the individual coefficients is a meaningless exercise.
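The order dependence described above can be reproduced numerically. The sketch below assumes positive correlations of 0.50 and 0.60 with performance (the square roots of the 25 and 36 percent overlaps) and the stated intercorrelation of 0.60. It uses the standard two-predictor formula for the squared multiple correlation. The function name and the dictionary layout are illustrative assumptions, not the report's own program.

```python
def order_dependent_split(r1y, r2y, r12):
    """Credit explained variance to X1 and X2 under both orders of entry."""
    # Squared multiple correlation for two correlated predictors.
    r_squared = (r1y**2 + r2y**2 - 2 * r1y * r2y * r12) / (1 - r12**2)
    # The variable entered first keeps its full squared correlation;
    # the second receives only what is left (squared semi-partial correlation).
    x1_first = {"X1": r1y**2, "X2": r_squared - r1y**2}
    x2_first = {"X2": r2y**2, "X1": r_squared - r2y**2}
    return r_squared, x1_first, x2_first


r_squared, x1_first, x2_first = order_dependent_split(0.50, 0.60, 0.60)
print(round(r_squared, 2))
print({name: round(share, 2) for name, share in x1_first.items()})
print({name: round(share, 2) for name, share in x2_first.items()})
```

Both orders account for the same total, about 0.39 of the variance, yet the share credited to each predictor changes with the order of entry. That is the reason the individual coefficients cannot be interpreted separately.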
Because of this, there are some, like Darlington (1968, p. 169), who, after reviewing the problem at some length, conclude: "It would be better to simply concede that the notion of 'independent contribution to variance' has no meaning when predictor variables are intercorrelated."

Eigenvalues



Given a correlation matrix A, such as table T.3 (or any real symmetric matrix), there exists a set of eigenvalues, λi, such that:

$$|\lambda I - A| = 0$$

For a four-variable study, the determinant of the correlation (aij) matrix in the above expression could be written as follows:



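$$
\begin{vmatrix}
\lambda - a_{11} & -a_{12} & -a_{13} & -a_{14} \\
-a_{12} & \lambda - a_{22} & -a_{23} & -a_{24} \\
-a_{13} & -a_{23} & \lambda - a_{33} & -a_{34} \\
-a_{14} & -a_{24} & -a_{34} & \lambda - a_{44}
\end{vmatrix} = 0
$$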



The expansion of this determinant yields a polynomial φ(λ) of degree N in λ which is known as the characteristic polynomial of the matrix A. The equation, φ(λ) = 0, is called the characteristic equation of A and its roots λ1, ..., λN are called the characteristic roots (or eigenvalues) of A. For the purposes of this report it is not necessary for the reader to understand the mathematics required to calculate eigenvalues since, even for a matrix of modest size, a computer would be required to perform the calculations. It is important though that the reader be aware of some of the ways they can be used to facilitate the interpretation of data from the undesigned experiment.
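For the smallest possible case the characteristic equation can be solved by hand, which may help make the definition concrete. The worked example below assumes a two-predictor correlation matrix with a single intercorrelation r. The value r = 0.60 is borrowed from figure F.2 purely for illustration.

```latex
\begin{aligned}
\begin{vmatrix} \lambda - 1 & -r \\ -r & \lambda - 1 \end{vmatrix}
  &= (\lambda - 1)^2 - r^2 = 0
  \;\Longrightarrow\; \lambda = 1 \pm r, \\
r = 0.60: \quad \lambda_1 &= 1.60, \qquad \lambda_2 = 0.40, \\
\lambda_1 + \lambda_2 &= 2, \qquad \lambda_1 \lambda_2 = 0.64 = 1 - r^2 .
\end{aligned}
```

With r = 0 both roots equal one. As r grows, one root rises above one and the other falls below it.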



The set of eigenvalues for an orthogonal matrix (e.g., T.2) would all be equal to one. This should be obvious from the above explanation, since an orthogonal correlation matrix is a unit matrix with all ones in the diagonal, which would yield a determinant equal to zero only if the λ were also all ones.

For a non-orthogonal matrix, however, the eigenvalues are no longer either equal or necessarily one. Instead, some of them are larger than one and some smaller than one. The more non-orthogonal the design matrix, the greater the range of values. For example, the eigenvalues for a fictitious moderately non-orthogonal design of eight variables might be as follows:

    λ1 = 1.55
    λ2 = 1.36
    λ3 = 1.15
    λ4 = 1.03
    λ5 = 0.97          [T.4]
    λ6 = 0.85
    λ7 = 0.64
    λ8 = 0.45

while the eigenvalues for a fictitious more severely non-orthogonal design of eight variables might be as follows:

    λ1 = 3.22
    λ2 = 2.18
    λ3 = 1.30
    λ4 = 0.74
    λ5 = 0.31          [T.5]
    λ6 = 0.18
    λ7 = 0.05
    λ8 = 0.02

Note how the range has increased in the second case, T.5, and how small some of the eigenvalues are. Both sets sum to 8.00.

Given the set of eigenvalues for a matrix, however, an investigator can use them as a means of better understanding his data. The following applications can be made.

The sum of the eigenvalues will always equal N, the number of predictor variables in the experiment, whether the matrix be from a designed or undesigned experiment. However, with undesigned experiments, since some of the eigenvalues can be very small fractions, fewer than the total number of eigenvalues may be needed to almost sum to N. This can provide a clue as to how many critical variables are actually influencing performance. For example, in the set of eigenvalues for a severely non-orthogonal design, T.5, 99 percent of the variation is explained by the first six eigenvalues. Although the fact that no eigenvalue is zero indicates that all eight variables have some effect on performance, for all practical purposes only six are probably really critical. This could be important to know if an investigator wished to eliminate some of the terms in an equation. (Note: There is no one-to-one relationship between the numerical ordering of eigenvalues and variables.)
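The count of "critical" eigenvalues described above can be checked directly from the two fictitious sets, T.4 and T.5. The following is a minimal sketch. The 99 percent cut-off comes from the example in the text, while the function name and the `threshold` parameter are assumptions made for illustration.

```python
T4 = [1.55, 1.36, 1.15, 1.03, 0.97, 0.85, 0.64, 0.45]
T5 = [3.22, 2.18, 1.30, 0.74, 0.31, 0.18, 0.05, 0.02]


def eigenvalues_needed(eigenvalues, threshold=0.99):
    """Smallest number of largest eigenvalues whose sum reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return count, running / total
    return len(ordered), 1.0


for label, values in (("T.4", T4), ("T.5", T5)):
    count, share = eigenvalues_needed(values)
    print(label, count, round(share, 3))
```

For the severely non-orthogonal set the first six eigenvalues already reach the cut-off, whereas the moderately non-orthogonal set needs all eight.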

The sum of the reciprocals of the eigenvalues is an indication of the degree of matrix non-orthogonality. This value for a completely orthogonal design, of course, equals the number of predictors, N. The more correlated the predictor variables, however, the smaller some eigenvalues will become and therefore the larger the sum of their reciprocals. This sum divided by N shows how many times greater the squared distance is between sample (estimated) and population (true) beta coefficients for the non-orthogonal design than it would have been for an orthogonal design.

The product of the eigenvalues equals the determinant of the matrix. The larger the determinant (up to a maximum of one for an orthogonal correlation matrix, whose eigenvalues are all one), the more orthogonal the design. Later in this report, the determinant will be used as a criterion for selecting the coordinates of data collection points which, when added to the conditions of an undesigned experiment, will make it more orthogonal.
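The last two applications, the reciprocal sum and the determinant, reduce to a few lines of arithmetic once the eigenvalues are known. The sketch below applies them to the fictitious sets T.4 and T.5 and to an orthogonal set of eight ones. The function and key names are illustrative assumptions.

```python
import math

T4 = [1.55, 1.36, 1.15, 1.03, 0.97, 0.85, 0.64, 0.45]
T5 = [3.22, 2.18, 1.30, 0.74, 0.31, 0.18, 0.05, 0.02]
ORTHOGONAL = [1.0] * 8


def nonorthogonality_summary(eigenvalues):
    """Summaries of a correlation matrix computed from its eigenvalues."""
    n = len(eigenvalues)
    reciprocal_sum = sum(1.0 / value for value in eigenvalues)
    return {
        "sum": sum(eigenvalues),                   # always N
        "distance_ratio": reciprocal_sum / n,      # 1.0 when orthogonal
        "determinant": math.prod(eigenvalues),     # 1.0 when orthogonal
    }


for label, values in (("orthogonal", ORTHOGONAL), ("T.4", T4), ("T.5", T5)):
    summary = nonorthogonality_summary(values)
    print(label, {key: round(val, 4) for key, val in summary.items()})
```

On these figures the expected squared distance between estimated and true coefficients is roughly 15 percent larger than the orthogonal case for T.4 and about ten times larger for T.5, while the determinant falls from one toward zero.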
IMPROVED METHODS OF HANDLING DATA FROM NON-ORTHOGONAL DESIGNS



To extract the most information from the undesigned experiment, full advantage must be taken of any technique that can offset the problems associated with this class of experiment. In the last two sections of this report, two approaches will be described that are superior to the more conventional techniques in popular use today. These approaches involve:

• Collecting additional data at specific coordinates of the experimental space to improve the orthogonality of the design.

• Using "ridge regression" analysis to provide more stable and meaningful regression coefficients with which to fit the data from non-orthogonal experimental designs.

Conceptually, these techniques are relatively easy to understand. Implementing them, however, will require the talents of the investigator, a computer programmer, and possibly a statistician. In all cases the only practical way in which these techniques will be employed is with the aid of a high-speed computer. In the body of this report, no detailed discussion of the computations required for the analyses will be given. However, in the appendices both general and specific references regarding the computational efforts are supplied along with listings of complete programs. When these are not sufficient, the reader is encouraged to refer to the original papers.



SECTION III
COLLECTING ADDITIONAL DATA TO ORTHOGONALIZE THE UNDESIGNED EXPERIMENT

The non-orthogonality of the undesigned experiment complicates the interpretation of results. In this section, methods of collecting additional data that will alleviate this situation are proposed. Specifically, information will be provided here to tell the reader:

• How to select the coordinates of new data points that will improve the orthogonality of the original design.

• How to handle irrelevant shifts in performance that may occur between the time when data are collected on original and subsequent runs.

PRACTICAL CONSIDERATIONS

Since most undesigned experiments are those in which the experimenter has little or no control over the levels of his variables, it may appear presumptuous to suggest an approach that requires just such control. The point in fact is that there are circumstances when this approach can be used, and an investigator should be aware that such an approach exists and be prepared to use it should the occasion arise. Sometimes, if only a few additional points are needed, an investigator can make a concerted effort to set up the required conditions in a way that would not be justified for an entire experiment. At other times, once the principles involved in adding points are understood, experimental conditions that are not located optimally can be considered which will still improve the orthogonality of the design and the interpretability of the data. All in all, the knowledge of how to properly add data points is a useful experimental tool that has applications beyond the immediate problem.

Other useful applications of these techniques for the design of experiments are cited in Appendix E.

Factors that must be taken into consideration before this technique is employed include:

• Computer facilities must be available because of the amount and complexity of the computations required.

• Variables should be measurable on quantitative and continuous scales.

• The added costs of data collection must be weighed against any anticipated improvement in data interpretability.

Little effort is made in this report to help an investigator select which alternative method he should use for his particular problem. Nor is more than a superficial effort made to identify and handle special problems that might arise uniquely in behavioral research.

SELECTING NEW DATA POINTS TO IMPROVE DESIGN ORTHOGONALITY

Adding experimental conditions at the proper coordinates within the experimental space can reduce the non-orthogonality of an undesigned experiment. When an ill-conditioned design can be repaired sufficiently this way, the data may be interpreted with the finesse ordinarily reserved for data obtained from orthogonal designs. Improved orthogonality depends solely on the location of the experimental conditions and is independent of the responses obtained under those conditions.

Two methods of selecting these additional data points have been proposed. These are:

• Search the entire region of interest in the experimental space to find one or more points that satisfy the selection criterion.

• Examine a group of "candidate" points to see which one best meets the selection criterion.

The first will be called the "random search approach" and the second, the "candidate selection approach".



The Selection Criterion

With either approach, given an initial set of non-orthogonal conditions,

    the orthogonality of an experimental design will be
    improved if the next condition is chosen at the point
    in the region of interest where the variance of the
    fitted response, V(ŷ), is largest.

When data have been collected within the experimental space in some non-systematic fashion, the precision of the data throughout the continuous response surface will be irregular, with greater precision naturally lying in the vicinity of where the greatest amount of data was collected and vice versa. The selection criterion says that to improve orthogonality additional data should be collected at the point or points in the response surface where the precision is poorest, i.e., the variance is highest. In Appendix A, methods of discovering and measuring this point of maximum variance will be discussed.

When a data point is added to the non-orthogonal design at the point on the response surface where variance is highest, the following occur:

• The non-orthogonal design becomes more orthogonal.

• The variance at that point is reduced.

• The design becomes more "rotatable" over a spherical region of interest. (A rotatable design is one in which the variances of estimated values equidistant from the center of the design will be equal. See Box and Hunter, 1958, p. 167.)

• The overall variance of the polynomial is reduced.

• The confidence regions about the regression coefficients are reduced.

Mathematically, adding a new data point in the region of interest where the variance is largest also maximally increases the determinant of the revised (old plus new points) experimental design matrix. All of the effects cited above will also occur as a consequence of maximizing the determinant.

Therefore, if it is practical to do so, selecting the point that will maximize the determinant of the revised matrix could be substituted for the criterion of selecting the point on the response surface where variance is highest. Computer programs for calculating determinants are cited in Appendix A.
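The equivalence of the two criteria can be seen from two standard least squares results, written out below as a sketch. They are textbook identities rather than formulas reproduced from Appendix A. The symbol f(x), the row of model terms evaluated at a prospective point x, is notation assumed here for illustration.

```latex
\begin{aligned}
V\bigl(\hat{y}(x)\bigr) &= \sigma^2 \, f(x)' \, (X'X)^{-1} f(x), \\
\bigl| X'X + f(x) f(x)' \bigr| &= |X'X| \, \bigl[\, 1 + f(x)' \, (X'X)^{-1} f(x) \,\bigr].
\end{aligned}
```

Because |X'X| is fixed by the existing design, the point that makes the variance term largest is also the point that makes the determinant of the augmented matrix largest.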

Random Search Approach

Hebble and Mitchell (1972) propose that a computer be used to randomly search the existing experimental space (of the original undesigned experiment) to find where the variance of the estimated response is maximum. When found, that point would be the next condition to add to the experimental design. The process is then repeated, seeking the point where the variance is maximum within the space now defined by the original plan plus the first additional point. A third point will be added where the variance is maximum within the space defined by the original and two additional points. This process continues until the investigator is satisfied with the degree of correction obtained. Once a sufficient number of data points have been selected, the performance data can be collected.
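The search loop just described can be sketched in a few dozen lines. This is not Hebble and Mitchell's program. It is a minimal illustration that assumes a first-order model in two variables, a square region of interest, and the standard expression f(x)'(X'X)^-1 f(x) for the relative variance of the fitted response. All function names and the small fictitious design are assumptions.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def model_row(point):
    """Terms of a first-order model: a constant plus one term per variable."""
    return [1.0] + list(point)


def information_matrix(points):
    """X'X for the design made up of the given points."""
    rows = [model_row(p) for p in points]
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]


def relative_variance(points, x):
    """V(y)/sigma^2 of the fitted response at x for the current design."""
    f = model_row(x)
    return sum(fi * si for fi, si in zip(f, solve(information_matrix(points), f)))


def random_search(points, bounds, trials=2000, seed=0):
    """Return the random trial point with the largest relative variance."""
    rng = random.Random(seed)
    best_point, best_value = None, -1.0
    for _ in range(trials):
        x = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        value = relative_variance(points, x)
        if value > best_value:
            best_point, best_value = x, value
    return best_point, best_value


# Fictitious design in which the two predictors are strongly correlated.
design = [(-1.0, -1.0), (-0.5, -0.4), (0.0, 0.1), (0.5, 0.4), (1.0, 1.0), (0.2, 0.3)]
region = [(-1.0, 1.0), (-1.0, 1.0)]
for _ in range(2):  # add two points, re-searching after each addition
    point, value = random_search(design, region)
    design.append(point)
    print([round(c, 2) for c in point], round(value, 2))
```

In this fictitious case the search is drawn toward the corners of the region where one predictor is high and the other low, which are the combinations the correlated design never covered.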

Hebble and Mitchell (1972, p. 768) state: "When there are not more than two independent variables, ... we use a grid search procedure. When the factor space is of higher dimension, ... we favor a random search technique. We chose random search in preference to more sophisticated optimization procedures for the following reasons: (i) The random search technique is easier to use, especially when the region of interest R is constrained in strange ways. (ii) We feel that the random search technique can be most easily extended to the simultaneous considerations of several criteria."

Example. Hebble and Mitchell (1972, p. 776) show how their random search approach can be applied to repair the non-orthogonal design used in a chemical problem. Four predictor variables were involved. They had planned to use a third-order rotatable design requiring 81 runs, but during the experiment, some combinations were never run because of equipment limitations. As a result, the orthogonality of the design was destroyed. To repair the design, they added five new design points using the random search approach. The change in the "information contour" of two dimensions of the response surface before and after the extra data points were added is shown in figure F.3. This contour of constant information, I, is inversely related to the variance contour, i.e., I = σ²/V(ŷ). The improvement in rotatability after the points had been added is visually obvious. There was a corresponding improvement in the other qualities affected by adding points at the maximum V(ŷ), which also maximizes the determinant of the augmented design.

ILLUSTRATING THE IMPROVEMENT IN ROTATABILITY IN VARIANCE CONTOURS AFTER FIVE DATA POINTS HAD BEEN ADDED WITHOUT BLOCKING
(Panels: original contours; improved contours)
[Adapted from Figure 6 in paper by Hebble and Mitchell (1972)]



Precaution. The original purpose for adding more data points to the original undesigned experiment was to improve the orthogonality of the design, which in turn could facilitate the interpretation of the results. It should be noted, however, that although using the maximum variance criterion does improve the design for that purpose, it does not necessarily provide an optimum design. Hebble and Mitchell (1972, p. 778) recognized this when they wrote: "... in many cases, 'bias' caused by fitting an inadequate model will be a more important source of error in the fitted response than will variance." Bias error can be present, for example, if a higher-order relationship exists in fact between variables and performance but these effects cannot be isolated by the existing experimental design. Because the bias criterion would be a more difficult one to meet, Hebble and Mitchell ignore the problem. In the next section, Dykstra suggests some ways of meeting it.

Candidate Selection Approach 

Dykstra (1971) proposes that instead of searching randomly through the region of interest for the point where the estimated response variance, V(ŷ), is maximum, a group of candidate points should be selected on some rational basis. Then the V(ŷ) of these candidate points would be calculated for the existing design and the one with the largest V(ŷ) would be used for the next run. Candidates would continue to be evaluated this way for each successive run.

Of course, none of the candidate points will necessarily be located precisely at the point on the response surface where the V(ŷ) is maximum. This makes the results somewhat less accurate initially than the random search approach. However, when a series of runs is made, the approach becomes self-correcting. One advantage of this approach over the random search approach is the reduction in computer time.

Designs that satisfy both bias and random error criteria have been proposed by Box and Hunter (1958). Tests of the goodness of fit of a specific model are applied. If the fit is found inadequate, data points that will enable a higher-order model to be fit are added to the original design. (See Simon, 1970 and 1973.)

Selecting a Set of Candidate Points. There are a number of practical considerations affecting the rational selection of a set of candidate points. For example, the investigator would avoid selecting points:

• That are not feasible to run.

• Where no response is likely to occur.

Thus, unlike the random search technique, the use of rationally selected candidates permits the experimenter to impose his judgment onto the mathematical criteria by selecting points of practical interest as well. This enables the number of different levels of each factor to be kept reasonably low, an important consideration when changing experimental conditions is difficult or time consuming.
Dykstra (1971), by choosing candidate points, is more able to attack the problem of equation bias that Hebble and Mitchell ignored. The candidate points should be selected in a way that not only improves orthogonality and the associated reduction in variance but also develops into a design for a model that will adequately fit the data. He suggests the following:

    "In choosing specific combinations, however, one should be
    guided by the model. For a first-order model the procedure
    will select points at the extremes of the experimental space,
    so that only corner points need be specified as candidates.
    For a second-order model the list of candidates should
    include the axial points and a center point, in addition to
    corner points. A cubic model should have the candidates at
    four levels of the controllable variables, and so on." (p. 684)

Selecting the Point for the Next Run. Given a set of candidate experimental conditions, the one selected to run next is the one that gives the highest variance for the estimated response at that point, i.e., where the value of V(ŷ) is greatest, or the one that, when added to the existing design, maximizes the determinant for the augmented matrix. Each additional candidate point is selected sequentially in the same way until there is a decision to stop. The number of candidate points to be used, although at the discretion of the experimenter, may be based partially on the number required to meet the characteristics of the model and partially on the improvement needed in the precision of the equation.

Mitchell (1974) proposes his own algorithm, "DETMAX", for design augmentation which searches for complete subsets of candidate points that will optimize (almost) the determinant of the X'X matrix. He states that this method will give higher values of |X'X| than Dykstra's one-point-at-a-time approach, but admits that the latter "is seldom far off, and takes much less time on the computer. In many practical situations, when the object is to find a good (not necessarily 'optimal') design quickly, the sequential procedure will be quite satisfactory." (p. 206)
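The one-point-at-a-time procedure can be sketched using the determinant form of the criterion. This is an illustration under stated assumptions, not Dykstra's or Mitchell's program: it assumes a first-order model in two variables, candidates at the four corners and the center of a square region, and it drops each candidate from the list once chosen. That last choice is made here for simplicity and is not taken from the source. All names and the fictitious design are hypothetical.

```python
def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in a]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap changes the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def information_matrix(points):
    """X'X for a first-order model (constant plus one term per variable)."""
    rows = [[1.0] + list(p) for p in points]
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]


def augment(design, candidates, n_new):
    """Sequentially add the candidate that maximizes |X'X| of the augmented design."""
    design = list(design)
    pool = list(candidates)
    chosen = []
    for _ in range(n_new):
        best = max(pool, key=lambda c: determinant(information_matrix(design + [c])))
        pool.remove(best)
        design.append(best)
        chosen.append(best)
    return chosen, design


# Fictitious correlated design and a rationally chosen candidate list.
design = [(-1.0, -1.0), (-0.5, -0.4), (0.0, 0.1), (0.5, 0.4), (1.0, 1.0), (0.2, 0.3)]
candidates = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0), (0.0, 0.0)]
chosen, augmented = augment(design, candidates, 2)
print(chosen)
print(determinant(information_matrix(augmented)) / determinant(information_matrix(design)))
```

In this fictitious case the two off-diagonal corners are picked first, since they are the conditions the correlated design left uncovered, and the determinant grows many times over.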

Example. Dykstra (1971) improved the orthogonality of a 20-run "undesigned" experiment with two correlated predictors by sequentially adding six out of nine candidates needed to improve a second-order design. The nine candidate points were the four corners of a square, the four extremes of the axes, and one center point, with the non-center points placed equidistant from the center of the space. The changes in the variance contour before and after several degrees of augmentation are shown in figure F.4. The shift toward a more rotatable design is visibly obvious. After the six points had been added, the determinant of the augmented design for the 26 points is 3.39 x 10 times larger than it had been for the original 20 points.

HANDLING PERFORMANCE SHIFTS BETWEEN ORIGINAL AND ADDED POINTS

Characteristically in human performance research, if experimental conditions are measured sequentially, changes in performance may be observed that are not due to the experimental variables. If there is a considerable interval between the time the performance data are collected from the original undesigned experiment and from the additional points, unexplained and undesired performance shifts may occur. These can be due to changes in the subject, in the environment, in the equipment, or in any number of unknown factors. In any case, unless this shift in performance between blocks of data is dealt with properly, it will distort the information of interest.

CHANGE IN VARIANCE CONTOURS FROM ORIGINAL 20 EXPERIMENTAL CONDITIONS (FIG. 1) AS TWO (FIG. 2) AND SIX (FIG. 3) DATA POINTS ARE ADDED
[Adapted from Figures 1, 2, and 3 in the paper by Dykstra (1971).]



This blocking problem can be handled in two ways:

• By including a blocking term in the regression model.

• By adding data points in balanced pairs.

Adding a Blocking Term

Hebble and Mitchell (1972, p. 771) suggest that a blocking term be included in the regression model to account for a possible difference in overall response level between the initial design and the runs that are chosen to augment it. They say (p. 771): "... when a constant is already in the model, we can account for a possible block effect simply by introducing a 'dummy variable', which takes the value of 0 for each run in the initial design and a value of 1 for each additional run. When this is done, the model for the original design is unchanged by the introduction of the blocking variable. Thus, the first new point in the design can be selected without introducing a blocking variable. To select further new points, the blocking variable should be included, and a 1 should appear in the blocking column of the X matrix in every row which is part of the additional block of runs." They note that different data points will be selected when a blocking term is and is not included (see figure F.5). Generally, it is wiser to include a blocking term. Procedural precautions against sequence effects (Simon, 1974) also should be employed whenever possible.
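The dummy-variable coding quoted above amounts to one extra column in the X matrix. The sketch below builds that matrix for a first-order model. The function name, the column order (constant, predictors, block) and the example coordinates are illustrative assumptions.

```python
def design_matrix_with_block(original_runs, added_runs):
    """Rows of X: constant, predictor levels, then the block dummy.

    The dummy is 0 for runs in the initial design and 1 for every run
    in the additional block, as in the coding quoted above.
    """
    rows = [[1.0] + list(run) + [0.0] for run in original_runs]
    rows += [[1.0] + list(run) + [1.0] for run in added_runs]
    return rows


original = [(-1.0, -1.0), (0.0, 0.1), (1.0, 1.0)]   # fictitious initial runs
added = [(1.0, -1.0), (-1.0, 1.0)]                   # fictitious augmenting runs
for row in design_matrix_with_block(original, added):
    print(row)
```

Fitting the model to this matrix lets the coefficient of the last column absorb any overall shift in response level between the two blocks of runs.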

ILLUSTRATING HOW FOUR CANDIDATE POINTS ARE SELECTED DIFFERENTLY DEPENDING ON WHETHER A BLOCKING TERM IS (B) OR IS NOT (A) INCLUDED
[Adapted from Figure 5 in the paper by Hebble and Mitchell (1972).]

Adding Data Points in Pairs

Dykstra (1966, p. 279) suggests that another way of handling this problem is to augment the design with pairs of data points. Orthogonal blocking will be obtained if pairs of data points are selected so that the averages of the coordinates of these new pairs equal the averages of the corresponding coordinates of all conditions in the original design. For example, if the original design of three variables had been made up of data points at the following coordinates:

                      Variable
    Data Points     I      II     III
    No. 1           3      2      --
    No. 2           --     3      2
    No. 3           4      5      --
    No. 4           2      --     1
    Average         2.5    3.5    3.0

    (Entries are variable levels, or coordinates; the original design is composed of four data points.)

then, to be orthogonal, the two new data points would have to be selected at points such as the following:

    Data Points     I      II     III
    No. 5           4      5      2
    No. 6           1      2      4
    Average         2.5    3.5    3.0



Dykstra in his 1971 article did not discuss this blocking method when he used the maximum V(ŷ) criterion to find the coordinates where the next data point is to be added. However, it could still be used if candidate points were designated in pairs and the criterion for selecting the proper pair were that which maximizes the determinant of the augmented design. Mitchell's (1974) DETMAX, for example, might be used for this application.
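The balancing condition for a pair of added points is easy to check by machine. The sketch below tests whether a proposed pair averages to the original design averages on every variable. It uses the pair of new points and the averages 2.5, 3.5 and 3.0 shown for them in the example above. The function name and the tolerance are assumptions.

```python
def pair_is_balanced(pair, original_means, tol=1e-9):
    """True if the pair's coordinate averages match the original design averages."""
    first, second = pair
    pair_means = [(a + b) / 2.0 for a, b in zip(first, second)]
    return all(abs(m - o) <= tol for m, o in zip(pair_means, original_means))


original_means = (2.5, 3.5, 3.0)
print(pair_is_balanced(((4, 5, 2), (1, 2, 4)), original_means))   # the pair from the example
print(pair_is_balanced(((4, 5, 2), (1, 2, 5)), original_means))   # third coordinate off balance
```

A candidate list made up only of pairs that pass this check would let pairs be ranked by the determinant criterion while keeping the block effect orthogonal to the predictors.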
SECTION IV

RIDGE REGRESSION ANALYSIS

The purpose of regression analysis is to obtain a set of coefficients for an equation that will fit the existing data without bias and with a minimum amount of variable error. The conventional criterion of a best equation is one in which the sum of the squared errors between estimated and observed responses will be at a minimum.

When an orthogonal design has been employed (i.e., the predictor variables are mathematically independent), the estimated beta coefficients are reasonable representations of the true beta coefficients, within the limits set by the error estimate. When a non-orthogonal design has been employed, the individual betas calculated on the basis of the least squares criterion are often unsatisfactory. While the overall equation may be adequate for prediction, the relative effects of individual terms cannot be evaluated. With non-orthogonal designs, beta coefficients derived from a least squares fit may not make sense in the real world.

Hoerl and Kennard (1970a, b) cite the following characteristics of coefficients estimated from ill-conditioned experimental designs:

• The coefficients become too large in absolute value.

• Some coefficients may have the wrong sign.

• Collectively the coefficients are unstable; another set of performance data would be unlikely to give the same beta values.

• Individual coefficients may be over- or under-estimates of the strength of a particular factor.

The more non-orthogonal the original design, the poorer the equation is likely to be.

All of these conditions stem from the correlations among the predictor variables. In the past, in order to untangle the relationship among the factors, it has either been necessary to drop those predictors that correlate the highest with the others or to treat the total equation as a black box. However, some of the power of the regression model is lost by either of these approaches. As an alternative to conventional multiple regression (least squares) analysis with non-orthogonal data, Hoerl and Kennard propose "ridge regression". This analysis, they suggest, will obtain a better prediction equation in which:

• The estimated coefficients will be closer to the true coefficients on the average;

• The signs will be more meaningful;

• A point estimate of a response can be made with a smaller mean square error;

• The coefficients will be more stable and likely to be repeated if new data are taken.

MATHEMATICAL BASIS FOR RIDGE REGRESSION 

Hoerl and Kennard (1970a, b) supply the mathematical basis for ridge
regression analysis. Only the rudiments of their explanation will be supplied
here. The reader should refer to the original papers if more details are
desired. Marquardt (1970) also deals with the mathematics of ridge
regression as part of a broader class of biased linear estimators employing
generalized inverses.



Essentially, Hoerl and Kennard (1970a) show that in conventional multiple
regression analysis, the average value of the squared distance, $E(L^2)$,
between the estimated, $\hat{\beta}$, and the true, $\beta$, beta coefficients is equal to the
error variance, $\sigma^2$, of the data multiplied by the sum of the reciprocals of
the eigenvalues, i.e.,

$$E(L^2) = E[(\hat{\beta} - \beta)'(\hat{\beta} - \beta)] = \sigma^2 \sum_i (1/\lambda_i)$$    [E.3]

When the predictor variables are uncorrelated, the eigenvalues, $\lambda_i$, are each
equal to one. In that case, the average squared distance between estimated
and true beta coefficients will be equal to the error variance of the data
multiplied by the number of variables involved. However, when the
predictors are correlated, as in the case of the undesigned experiment, some
of the eigenvalues become very small and their reciprocals very large. This
increases the average squared distance between the estimated and true beta
coefficients. A least squares fit of data from a non-orthogonal experimental
design also produces coefficients that are too large in absolute value.
To compensate for these large positive coefficients, other coefficients are
estimated that are too negative, which often may be the incorrect sign. The
more ill-conditioned the design matrix, the worse these conditions are likely
to be.
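The effect of small eigenvalues in equation E.3 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the error variance and the two sets of eigenvalues are hypothetical values chosen for the example (both sets sum to 4, as the eigenvalues of a four-variable correlation matrix must), and the function name is not taken from the programs in the appendices.

```python
def expected_squared_distance(error_variance, eigenvalues):
    """E(L^2) for least squares: sigma^2 times the sum of 1/lambda_i (equation E.3)."""
    return error_variance * sum(1.0 / lam for lam in eigenvalues)


sigma2 = 1.0  # hypothetical error variance

# Orthogonal design: every eigenvalue of the correlation matrix equals one.
orthogonal = [1.0, 1.0, 1.0, 1.0]

# Hypothetical ill-conditioned design: same total, but one eigenvalue is tiny.
ill_conditioned = [2.5, 1.0, 0.49, 0.01]

d_orth = expected_squared_distance(sigma2, orthogonal)
d_ill = expected_squared_distance(sigma2, ill_conditioned)

print("orthogonal design:      ", d_orth)
print("ill-conditioned design: ", d_ill)
```

In the orthogonal case the result is simply the error variance times the number of variables; in the second case the single eigenvalue of 0.01 contributes 100 times the error variance on its own.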

To correct for this, Hoerl and Kennard propose to add a small positive
quantity, k, to the unit diagonal of the intercorrelation matrix of the
predictor variables. For example, if the original intercorrelation matrix were:

                 Correlations between XiXj        Correlations
    Variables     X1      X2      X3      X4       XiY

       X1        1.0     0.23    0.45    0.67      0.14
       X2        0.23    1.0     0.15    0.36      0.26
       X3        0.45    0.15    1.0     0.89      0.54
       X4        0.67    0.36    0.89    1.0       0.22

then the new matrix, for example if k = 0.2, would be:

                 Correlations between XiXj        Correlations
    Variables     X1      X2      X3      X4       XiY

       X1        1.2     0.23    0.45    0.67      0.14
       X2        0.23    1.2     0.15    0.36      0.26
       X3        0.45    0.15    1.2     0.89      0.54
       X4        0.67    0.36    0.89    1.2       0.22

Note that k = 0.2 has been added to the 1's in the diagonal. Next, a
conventional least squares fit is done using the perturbed matrix. The results
produce what Hoerl and Kennard call "ridge coefficients," $\hat{\beta}^{*}$. The distinction
between the conventional beta coefficients, $\hat{\beta}$, and the ridge coefficients,
expressed in matrix algebra, is:

$$\hat{\beta} = (X'X)^{-1} X'Y,$$

and

$$\hat{\beta}^{*} = (X'X + kI)^{-1} X'Y.$$
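As a rough illustration of the two formulas, the sketch below applies them to the example intercorrelation matrix given in the text, with k = 0 for the conventional fit and k = 0.2 for the ridge fit. It works in correlation form, so X'X is the predictor intercorrelation matrix and X'Y is the column of XiY correlations. The helper names and the small Gaussian elimination routine are assumptions made for this example; they are not the programs listed in the appendices.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_coefficients(r_xx, r_xy, k):
    """Add k to the unit diagonal, then do the usual least squares solve."""
    perturbed = [[v + (k if i == j else 0.0) for j, v in enumerate(row)]
                 for i, row in enumerate(r_xx)]
    return solve(perturbed, r_xy)


# Example intercorrelation matrix and XiY correlations from the text.
R_XX = [[1.00, 0.23, 0.45, 0.67],
        [0.23, 1.00, 0.15, 0.36],
        [0.45, 0.15, 1.00, 0.89],
        [0.67, 0.36, 0.89, 1.00]]
R_XY = [0.14, 0.26, 0.54, 0.22]

beta_ls = ridge_coefficients(R_XX, R_XY, 0.0)     # conventional least squares
beta_ridge = ridge_coefficients(R_XX, R_XY, 0.2)  # ridge coefficients, k = 0.2

for name, ls, rg in zip(["X1", "X2", "X3", "X4"], beta_ls, beta_ridge):
    print(f"{name}: least squares {ls:8.4f}   ridge {rg:8.4f}")
```

Solving the linear system directly, rather than forming the inverse, is a design choice of the sketch; either route gives the same coefficients.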



Values of k between 0 and 0.9 may be substituted, with finer increments being
used at the lower end of the scale, below 0.1, where changes in the estimated
ridge coefficients are greater. Whereas the betas estimated from conventional
least squares are unbiased with minimum variance, the ridge
coefficients contain both a bias and a variable error. These two error
components are present in the equation (written in matrix algebra) for the
average squared distance between values of the ridge coefficients and the
true coefficients, thus:

$$E[L^2(k)] = \sigma^2 \sum_i \lambda_i/(\lambda_i + k)^2 + k^2 \beta'(X'X + kI)^{-2}\beta$$    [E.4]

The first component represents the variance and the second the bias.
Note that when k = 0, the second component disappears, leaving the unbiased
estimates of the coefficients found by a conventional least squares fit. As k
increases, so does the bias error.

However, Hoerl and Kennard demonstrate that as k increases, the
variance error decreases more rapidly than the bias error increases. This
means that, at some value of k, the mean square error (the combination of
bias and error variance) for the ridge coefficients will be smaller than it
would be for the conventional coefficients.

Exactly what has happened in this process is simple to understand if
equations E.3 and E.4 are referred to. In equation E.3, it can be seen that
the small eigenvalues have the greatest impact on the estimations. The
smaller some of the eigenvalues get (as a result of a non-orthogonal design),
the larger their reciprocals and the greater the squared distance between
estimated and true beta coefficients becomes. In equation E.4, it can be
seen that adding a constant k to the correlation matrix diagonal has the
effect of adding k to the eigenvalues of the variance component. For the
very small eigenvalues, the addition of even a small k can do much to
decrease the size of the reciprocals of the eigenvalues and to decrease the
squared distance between estimated and true beta coefficients.
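The trade-off described by equation E.4 can be tabulated directly. The sketch below is an illustration under stated assumptions: the eigenvalues, the error variance, and the true coefficients (expressed in the eigenvector basis, where the bias term reduces to a simple sum) are hypothetical numbers, and in practice the true coefficients are of course unknown.

```python
def variance_term(sigma2, eigenvalues, k):
    """First component of E.4: sigma^2 * sum of lambda_i / (lambda_i + k)^2."""
    return sigma2 * sum(lam / (lam + k) ** 2 for lam in eigenvalues)


def bias_term(alpha, eigenvalues, k):
    """Second component of E.4, k^2 * beta'(X'X + kI)^-2 beta, written in the
    eigenvector basis where alpha holds the true coefficients."""
    return k ** 2 * sum(a ** 2 / (lam + k) ** 2 for a, lam in zip(alpha, eigenvalues))


def mean_square_error(sigma2, alpha, eigenvalues, k):
    return variance_term(sigma2, eigenvalues, k) + bias_term(alpha, eigenvalues, k)


# Hypothetical values for illustration only.
SIGMA2 = 1.0
EIGENVALUES = [2.0, 0.9, 0.1]   # one small eigenvalue: an ill-conditioned design
ALPHA = [1.0, 0.5, 2.0]         # assumed true coefficients

grid = [i / 100.0 for i in range(0, 101)]
variances = [variance_term(SIGMA2, EIGENVALUES, k) for k in grid]
biases = [bias_term(ALPHA, EIGENVALUES, k) for k in grid]
totals = [v + b for v, b in zip(variances, biases)]

best_k = min(grid, key=lambda k: mean_square_error(SIGMA2, ALPHA, EIGENVALUES, k))
print("MSE at k = 0:   ", totals[0])
print("minimum MSE at k =", best_k)
```

At k = 0 the bias term vanishes and the variance term equals the least squares value from equation E.3; as k grows the variance term falls and the bias term rises.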

This phenomenon is illustrated in figure F.6. In this figure, both the bias
squared and the variance of the ridge regression coefficients have been
standardized by dividing each by the residual error variance of the
response data. The least squares variance (normalized) of the estimated
beta coefficients is represented by the horizontal line (a constant) across the
top of the graph. When k equals 0, of course, the variance of the ridge
coefficients is identical to the variance of the estimated beta coefficients,
and the bias squared (normalized) is zero. As k increases, however, it
can be seen that the variance decreases and the bias squared increases,
each as a monotonic function. The sum of these two effects, the mean square
error (as represented by the dashed "ridge" line), drops initially only to rise
later on. There will always be, for some value of k, a portion of the ridge
trace where the mean square error is smaller than it would be had no
distortion been introduced. In this example, the mean square error is at a
minimum for k = 0.05, nearly half the magnitude of the original variable
error. While there are other criteria than the minimum ridge value for
selecting the k where the ridge coefficients would be found, this figure does
illustrate how adding the bias can actually reduce the mean square error,
and thereby improve the estimates of the coefficients.

Two computer programs for performing ridge regression analysis are
listed in Appendices B and C, a print-out of the latter is given in
Appendix D, and some discussion of both programs is given in Appendix A.

RIDGE REGRESSION MEAN SQUARE ERROR FUNCTIONS
[From Figure 1 in Hoerl and Kennard's (1970a) paper.]

INTERPRETING RIDGE TRACES 

One of the advantages of ridge regression analysis over conventional
least squares is the ability to portray the sensitivity of the beta estimates
graphically. A two-dimensional ridge trace of the ridge coefficients is
obtained by plotting the estimated ridge coefficients against the values of k.
This is illustrated in figure F.7.

RIDGE TRACE: TEN FACTOR EXAMPLE
[From Figure 1 in Hoerl and Kennard's (1970b) paper. Annotation on the
figure: "Hoerl and Kennard used the coefficients at this value of k."]

The plot of solid lines illustrates how, as k increases, the ridge
coefficients diminish in absolute magnitude and begin to stabilize. If k were
to go to infinity (ad absurdum), of course, the above processes would be
complete, for all coefficients would be equal to zero. From figure F.7, it
is apparent that long before that point is reached the distance between the
estimated and true coefficients would be too large to be of practical value.
Quite obviously, therefore, it is necessary to select a minimum value of k
that will adequately provide an improved set of coefficients, ones that are
more meaningful and will result in more accurate predictions.

The dashed line at the bottom of figure F.7 is a plot of the residual sum of
squares as a function of k. It is normal that as bias is introduced into the
design matrix, the fit to the original data becomes poorer
(i.e., SSE becomes larger). It is only when the equation with the ridge
coefficients is used to estimate performance on new data that the estimation
has been improved.
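A ridge trace and its accompanying residual sum of squares can be sketched for the simplest case of two standardized predictors, where the perturbed matrix can be inverted by hand. The correlations used below are hypothetical and are not taken from the ten-factor example; the function names are likewise assumptions of this sketch.

```python
def ridge_two_predictors(rho, r1y, r2y, k):
    """Closed-form ridge coefficients for two standardized predictors
    whose intercorrelation is rho."""
    d = (1.0 + k) ** 2 - rho ** 2            # determinant of R + kI
    b1 = ((1.0 + k) * r1y - rho * r2y) / d
    b2 = ((1.0 + k) * r2y - rho * r1y) / d
    return b1, b2


def residual_ss(rho, r1y, r2y, b1, b2):
    """Residual sum of squares for standardized data: 1 - 2 b'r + b'R b."""
    fit = b1 * r1y + b2 * r2y
    quad = b1 * b1 + b2 * b2 + 2.0 * rho * b1 * b2
    return 1.0 - 2.0 * fit + quad


# Hypothetical correlations: two highly correlated predictors, both of which
# correlate positively with the response.
RHO, R1Y, R2Y = 0.95, 0.60, 0.50

k_values = [0.0, 0.05, 0.1, 0.2, 0.3, 0.5]
trace = []
for k in k_values:
    b1, b2 = ridge_two_predictors(RHO, R1Y, R2Y, k)
    trace.append((k, b1, b2, residual_ss(RHO, R1Y, R2Y, b1, b2)))

for k, b1, b2, rss in trace:
    print(f"k = {k:4.2f}   b1 = {b1:7.3f}   b2 = {b2:7.3f}   RSS = {rss:6.3f}")
```

With these numbers the least squares fit (k = 0) gives the second predictor a negative coefficient even though it correlates positively with the response; the sign reverses as k grows, the coefficients shrink, and the residual sum of squares rises, which is the pattern the text describes for figure F.7.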

Hoerl and Kennard did not feel that an automatic (mathematical)
solution for selecting the best k was justified. They stated (1970a, p. 64):

"The inherent boundedness assumptions in using $\hat{\beta}^{*}$ make it
clear that it will not be possible to construct a clear-cut,
automatic estimation procedure to produce a point estimate
(a single value of k or a specific value for each k), as can
be constructed to produce $\hat{\beta}$. However, this is no drawback
to its use because with any given set of data it is not
difficult to select a $\hat{\beta}^{*}$ that is better than $\hat{\beta}$."

They propose that instead of seeking a mathematical solution for k, the
ridge regression chart be examined visually. The following conditions should
be looked for when selecting the value of k:

1. The beta values and particularly their orders of magnitude have
begun to stabilize.

2. The coefficients no longer have unrealistically large absolute
values.

3. The coefficients with logically incorrect signs are approaching or
have reached the proper sign.

4. The residual sum of squares is not unreasonably inflated.

5. The ridge trace (representing the mean square error) is smaller
than the unbiased least squares variance.

In the analysis illustrated by figure F.7, Hoerl and Kennard (1970b) selected
a k between 0.2 and 0.3. Note how the coefficients have begun to stabilize,
how variables 6 and 7 have reduced considerably in magnitude (with 7 losing
its effectiveness almost completely), and how variable 5 has begun (but not
completed) a shift from a large negative to a low positive coefficient.
At k = 0.25, the residual sum of squares (SSE) has increased approximately
60 percent, from 0.10 to 0.16, while the expected squared distance
of the coefficient estimate from true coefficient values has reduced to
26 percent of its original value.



ALTERNATE METHODS OF SELECTING K

Techniques other than Hoerl and Kennard's have been proposed for
selecting the desired k value. Several investigators reanalyzed the data
from Gorman and Toman's (1966) 10-variable study that Hoerl and Kennard
had used for the analysis shown in figure F.7 in this report. For the same
data, using different criteria, the individuals cited below selected the
following k values:

    Basis of
    Selection      Source                               Value of k

    Inspection     Hoerl and Kennard (1970b, p. 72)       0.2500
    Bayesian       Lindley and Smith (1972, p. 17)        0.0390
    C_p plot       Mallows (1973, p. 672)                 0.0200
    Min. MSE       Farebrother (1975, p. 128)             0.0029

Lindley and Smith (1972) argue that since there is usually prior
information about the parameters relating predictors to performance, this
information should be exploited to find improved estimates of the parameters.
They apply Bayesian methods to linear regression analysis, arguing that in
the case of non-orthogonal data, the Bayesian method reaches the same
conclusion as the ridge method but has the added advantage of dispensing
with the rather arbitrary choice of k and allows the data to estimate it.
Using Gorman and Toman's (1966) data, they compare the coefficients
obtained by the three methods: least squares, Bayes, and ridge. They
note that the Bayes approach, like ridge, gets rid of the three major
complaints against betas obtained from least squares: large absolute values,
incorrect signs, and instability. Comparing the results of the Bayesian
versus the ridge approach, they note that all the estimated coefficients are
pulled towards zero, with those from the ridge being smaller since "a
considerably larger value of k than the data suggest" (p. 17) was used.

Farebrother (1975, p. 128) concludes from his own data and from a
reexamination of Mallows' k value, which he thought should have been
smaller, that "Hoerl and Kennard's quest for stability has led them too
far from the unbiased estimator."

It is appropriate to remind the reader that each of the above
investigators was applying a different criterion when he selected the optimum k
value, and what may be best for one purpose may not be best for another.
For example, Mallows' (1973) criterion C_p ("standardized total squared
error") is a measure that combines both bias and variable error, and he
selects the k value where C_p is minimum. This could conceivably correspond
to the minimum ridge value considered by Hoerl and Kennard in figure F.6
but which was not used in selecting the k in the analysis shown in figure F.7.
Which is better? The difference might be in whether one is more interested
in a good prediction without too great an increase in RSS, in which case the
mean square error or C_p should be minimized, or in comparing individual
terms, in which case the stability of the individual
coefficients becomes more important. Only experience is going to decide
how the numerous criteria must be traded off against one another.

McDonald and Schwing (1973) used ridge regression analysis on a
problem relating air pollution to mortality. They selected their value of k (which
was not necessarily optimal, in so far as the mean square error was
concerned) according to three criteria, i.e., at the point where:

• The order of magnitude of the coefficients had stabilized;

• The residual sum of squares and coefficient of determination had
values consistent with problems of that type;

• The ridge coefficients are within the 95 percent confidence ellipsoid
for the unknown true coefficients, assuming normally distributed
errors.

Newhouse and Oman (1971) propose several methods of choosing a k value
to use in ridge regression and investigate their properties using Monte Carlo
experiments with two predictors. It appears that an optimal choice of k
(or interval of k values) is an open question at this time unless one has
prior knowledge about the length and/or direction of the unknown coefficient
vector.

Although the problem of how and where to select the best value of k
has not yet been resolved, the overall superiority of ridge regression over
least squares regression in the analysis of non-orthogonal data has not been
seriously questioned.

Theobald (1974) independently demonstrated that provided

$$k < 2\sigma^2 / \beta'\beta$$

the mean square error of the ridge coefficients will always be smaller than
the least squares value of the conventional regression coefficients. For this
to be true, the values of the ridge coefficients must be bounded, a realistic
condition. Theobald does not attempt to precisely locate the optimum value
of k within the limits set by the equation.
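The limit in this inequality is easy to evaluate once values are assumed for the error variance and the true coefficients. The sketch below uses hypothetical numbers for both; in an actual analysis neither quantity is known exactly, so any bound computed from estimates should be treated as approximate.

```python
def theobald_bound(sigma2, beta):
    """Upper limit on k from the inequality above: 2 * sigma^2 / (beta' beta)."""
    return 2.0 * sigma2 / sum(b * b for b in beta)


# Hypothetical error variance and true (standardized) coefficients.
sigma2 = 0.5
beta = [0.5, -0.3, 0.8]

k_limit = theobald_bound(sigma2, beta)
print("ridge improves on least squares for 0 < k <", round(k_limit, 4))
```

The bound tightens as the coefficient vector grows longer relative to the error variance, which is consistent with the remark that the coefficients must be bounded.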

Banerjee and Carr (1971) suggest a different and "more meaningful"
criterion against which to assess the accuracy of the biased estimator than
that Hoerl and Kennard used. Hoerl and Kennard compared the size
of the mean square error of the biased estimators (where k > 0) against the
variance of the unbiased coefficients from the conventional least squares fit
(k = 0) to show there always exists a k at which the new mean square error
would be less than the original variance. Banerjee and Carr, however, argue
that it would be more meaningful to compare the mean square error of the
biased estimators against a modified variance,

P 



E (p - p)^ squared bias = tr^ S \}^^\ ^ ^^J 



rather than the one Hoerl and Kennard used.

However, they also show that even against this modified criterion, there
still exists a k where the biased estimators have a smaller mean square
error, although the effect is less pronounced. Banerjee and Carr suggest
that the "gain in accuracy may better be exhibited in relative terms, that is
in terms of percentages (or fractions) of the variance or $\hat{\beta}^{*}$ rather than
in absolute terms" (p. 898). $\hat{\beta}^{*}$ as used here refers to the ridge coefficients.

Goldstein and Smith (1974, p. 288) propose a modification of the ridge
approach "which might be appropriate if one were especially interested in
some particular $\beta_i$'s, or were worried that the Ridge estimate might distort
the estimation of those $\beta_i$'s which could be estimated accurately anyway."
They suggest the possibility of choosing different values of k for different
predictor components. They disagree with Hoerl and Kennard that this
procedure would offer little improvement over the use of a constant k. It would
depend, they claim, on what the optimum k would be for each component; if
it differed widely, then an improvement in the mean square error could be
expected.

IDENTIFYING CRITICAL VARIABLES 

When an equation is exceptionally long and many of its terms are
found to be inconsequential, some investigators will want to drop the terms.
In the case of the designed experiment in which variables are orthogonal to
one another, dropping terms of insignificant effects is a straightforward
process. In the case of the undesigned experiment, traditionally (because
of the intercorrelation among the variables) dropping a term just because a
coefficient is small would be unwise. However, since a shortened equation
is simpler and more convenient and economical to use, a variety of algorithms
have been devised to find the "best" subset of variables out of the total
considered in the undesigned experiment that will fit the data about as well as
the complete equation.

Techniques for selecting the "best" subset regression equations have
been primarily of two types: one, those that literally compare all possible
(or all reasonable) subsets of regression equations against some criterion of
goodness, or two, those with no exact criterion of goodness but which depend
upon a heuristic algorithm that will supply a group of potentially good
candidates from which the investigator will select the "best". Mathematical
criteria for comparing subset regressions have traditionally been either the
minimum error variance (which is the least squares fit criterion) or a
minimum total (bias and variance) error, with minor variations in the
exact form involved. Hocking (1972), Allen (1971), Helms (1974) and Beale
(1970) critically review these criteria. Since there can be $(2^k - 1)$ possible
subsets of an original equation with k variables, the main emphasis in
developing selection techniques that compare many subset regressions has
been to reduce the computation time required for the analysis. Some recent
efforts in this regard are those of Furnival and Wilson (1974) and LaMotte
and Hocking (1970). Among the techniques employing the less exact criterion
for selection, the stepwise regression algorithm has been perhaps the one
used most frequently by behavioral scientists analyzing non-orthogonal data.
This and related techniques are discussed by Draper and Smith (1966,
Chapter 6), by Beale (1970), and by Kerlinger and Pedhazur (1973, pp. 289-295).
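The count of candidate subsets grows quickly with the number of variables, which is why computation time dominates the all-subsets approach. The sketch below only enumerates the subsets for a small hypothetical set of predictor names; it does not fit or score any regression, and the names are placeholders.

```python
from itertools import combinations


def all_subsets(variables):
    """Yield every non-empty subset of the candidate predictors."""
    for size in range(1, len(variables) + 1):
        for subset in combinations(variables, size):
            yield subset


names = ["X1", "X2", "X3", "X4"]   # placeholder predictor names
subsets = list(all_subsets(names))

print(len(subsets), "subsets for", len(names), "variables")
print("subsets for 15 variables:", 2 ** 15 - 1)
```

Four variables give 15 subsets, while fifteen variables already give 32,767, each of which would require its own regression fit under a literal all-subsets comparison.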

Ridge regression provides an alternative solution to the subset
selection problem. By stabilizing the coefficients, it enables the relative
importance of predictor variables to be assessed more directly. However, in
spite of this, Hoerl and Kennard (1970b) specifically recommend that factors
with small coefficients not be dropped from the equation. Instead they
recommend the following procedure when some variables have small
coefficients and are believed to have small effects:

*Other techniques, such as canonical equations, factor analysis, and so forth
have been proposed for isolating subsets of variables. While these are
undoubtedly useful for certain purposes, they may be of limited value for
certain equipment design problems. The reason for this is that solutions
from these techniques result in composite variables. That is, a solution
will provide a set of mathematically independent variables which are
mathematical mixtures of the original variables. Such solutions, while
probably useful in test construction or personality assessment, will
ordinarily not be adequate for problems of equipment design. While any
technique that aids in interpreting data should be considered, beware of
relying on techniques that don't fit the particular problem under
investigation.

"To 'discard' a factor, set it at its average value for all
predictions, which is the equivalent of setting the
coefficient equal to zero. But do not delete and
reestimate . . . ." (p. 75).

The average value for any predictor is the mean of all levels of that
predictor used in the experiment. They demonstrate how eliminating
low effect predictors completely can result in an even more unstable solution
than when all predictors are retained.*
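The equivalence stated in the quotation can be verified numerically. In the sketch below, all coefficients, means, and predictor values are hypothetical; the point is only that holding a factor at its experimental mean removes its contribution in the mean-centered form of the equation, while every other coefficient is left exactly as estimated.

```python
def predict(intercept, coefficients, x):
    """Ordinary prediction: intercept plus the sum of coefficient * value."""
    return intercept + sum(b * v for b, v in zip(coefficients, x))


def predict_with_discarded(intercept, coefficients, x, means, discarded):
    """Hold each discarded factor at its experimental mean; keep all coefficients."""
    held = [means[j] if j in discarded else v for j, v in enumerate(x)]
    return predict(intercept, coefficients, held)


# Hypothetical fitted equation with three predictors.
means = [2.0, 10.0, 0.5]            # mean level of each predictor in the experiment
coefficients = [0.4, -0.05, 1.2]    # the second coefficient is judged negligible
y_mean = 3.0
intercept = y_mean - sum(b * m for b, m in zip(coefficients, means))

x_new = [3.0, 14.0, 0.7]
kept = predict_with_discarded(intercept, coefficients, x_new, means, discarded={1})

# Same prediction in centered form with the second coefficient set to zero.
centered = y_mean + 0.4 * (x_new[0] - means[0]) + 1.2 * (x_new[2] - means[2])
print(kept, centered)
```

No coefficient is reestimated at any point, which is the distinction the quotation draws between discarding a factor and deleting it from the fit.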

APPLYING RIDGE REGRESSION ANALYSIS TO A TARGET ACQUISITION
PROBLEM

Zaitzeff (1971) at the Boeing Company, Seattle, performed an undesigned
experiment to discover the function relating fifteen selected target and
background characteristics (table T.6) to the probability of acquiring targets.
Observers were required to find a variety of targets visually in a dynamically
changing scene. The empirical data thus obtained were subjected to
regression analysis.

First a stepwise regression was carried out on the data without stopping
until all of the variables had been entered into the equation. The order in
which they were entered into (or deleted from) the equation corresponds to
the order in which they are listed in table T.6. The first variable,
"probability of finding a static target", accounted for more than 80 percent of the
total performance variance. All 15 variables accounted for 93 percent of the
variance. Seven variables would have accounted for 90 percent of the variance.

*This recommendation must assume that the variables in the complete
equation are there because of some rational variable selection and not
merely on the whim of an investigator who'd "just like to see what would
happen" if they were included. As Hays (1963, p. 577) says: "Tracing
relationships among variables is the legitimate business of the scientist,
but simply asking if anything relates linearly to anything else in a large
set of variables is a pretty crude way to do business." This point is
discussed further in the paragraphs on "Data Selection" in Appendix A.

VARIABLES IN TARGET ACQUISITION STUDY IN
ORDER OF THEIR APPEARANCE IN STEP-WISE REGRESSION EQUATION

    Order                                                             Variable
    No.     Variable                                                  ID No.

     1.     Probability of finding a static target (PSTAT)             (11)
     2.     Number of filtered brightness elements in scene (NAVG)     (15)
     3.     Number of confusion areas in scene (AMBIG)                 (10)
     4.     Small dimension (LIT DM)                                   (3)
     5.     Target width (LIT DC)                                      (9)
     6.     Detail contrast (DCONTR)                                   (7)
     7.     Target length (BIG DC)                                     (8)
     8.     Area 1 variance (VARAR1)                                   (14)
     9.     Target area (TGAREA)                                       (4)
    10.     Target contrast (TCONTR)                                   (6)
    11.     Heterogeneity (HETERO)                                     (12)
    12.     Scan variance (VARAVG)                                     (16)
    13.     Large dimension (BIG DM)                                   (2)
    14.     Detail size (DETSIZ)                                       (5)
    15.     Area 1 count (NAREA1)                                      (13)

In the step-wise process, "Target Contrast" originally entered the
equation in the step following the entry of "Number of confusion areas in
scene", was later deleted in the step following entry of "Target length", and
finally reentered in the position indicated. (From Zaitzeff, 1971)

In examining the coefficients from the stepwise regressions, Zaitzeff
commented on the unsatisfactory results:

"Thus, it is disconcerting to see relatively large negative
coefficients assigned to such variables as Target Area,
Detail Size, Target Contrast, and Target Width, when
the factor analysis has shown them to be positively
correlated with dynamic acquisition probability" (p. 51).

He also cites other limitations of equations developed with the least squares
criterion.

Next a ridge regression analysis was carried out on the same data.
The ridge trace of the 15 target-background variables is shown in
figure F.8. The instability of a number of the variables is immediately
evident. For example, "Little Dimension" (#3) changed from having the
largest positive coefficient to one that ranks sixth, and "Target Area" (#4)
changed from having the largest negative coefficient to a slight positive one.
Zaitzeff decided on the basis of visual inspection that the coefficients
were reasonably stable at k = 0.7. In addition to "Target Area" (#4), both
"Detail Size" (#5) and "Target Contrast" (#6) show a sign change that
appears more meaningful in the light of what is known about visual
perception.



Variable Elimination. Zaitzeff eliminated variables that had stable
coefficients but low predicting power (a coefficient less than 0.05) and
those with unstable coefficients that failed to hold their predictive value.
He also eliminated two other variables, "Big Dimension" (#2) and "Area 1
Count" (#13), which correlated highly positively with two other variables and
were considered redundant. In addition, although "Static Acquisition
Probability" (#11) is shown to be the single best overall predictor of dynamic
acquisition probability, it was eliminated because it was an unwieldy value
to acquire and because Zaitzeff felt that it was actually a function of the
other physical and psychophysical variables rather than a distinct
target-background variable in and of itself.

RIDGE TRACE OF FIFTEEN TARGET-BACKGROUND VARIABLES
[From Figure 5-13 in Zaitzeff's (1971) paper]

A ridge regression was run on the remaining seven variables, resulting
in the new ridge regression pattern shown in figure F.9. Zaitzeff selected
k = 0.4 as the place where the coefficients appear to stabilize. This reduced
set of variables accounted for 79 percent of the observed variance (as
opposed to 96 percent when all variables were used). However, in the light
of what we know about shrinkage and the instability of the original coefficients,
the drop is not disturbing. Zaitzeff also noted that by reducing the number
to only four easily attainable physical measures, "Detail Contrast", "Target
Contrast", and "Target Length" and "Width", 66 percent of the observed
variance could still be accounted for. "Target Length" and "Target
Contrast" alone accounted for 48 percent of the variance. Several attempts to
include interaction or transgenerated terms proved less effective and were
aborted. Zaitzeff did not follow the procedure for eliminating variables
recommended by Hoerl and Kennard.

This study is one of the better ones attempting to relate target and
background characteristics to target acquisition performance and illustrates
the advantages of ridge regression over stepwise regression analysis.

The ridge pattern in figure F.9, showing primarily a relatively orderly
compression of coefficient values, is similar to the pattern found when
the bias is introduced in an orthogonal design.

RIDGE TRACE IN TARGET-BACKGROUND STUDY FOR ONLY
SEVEN VARIABLES
[From Figure 5-14 in Zaitzeff's (1971) paper]

APPENDIX A

SOURCES OF COMPUTER PROGRAMS NEEDED TO IMPLEMENT
THE TECHNIQUES DESCRIBED IN THIS REPORT

Implementing the computations used in the techniques in this report
will require the aid of a high-speed computer. Presumably the talents of the
experimenter, a computer programmer, and possibly a statistician must
be combined to provide the software required for the computations. In this
appendix, references are given to sources of computer programs and
subroutines needed to support the techniques, along with some general references
on the mathematics involved. The original papers are an excellent place to
begin to understand the computational requirements of these techniques.

GENERAL REFERENCES 

Some general references* on statistical techniques, mathematics, and
computer programs relevant to this report are:

Regression analysis

• Darlington (1968). Multiple regression in psychological research
and practice

• Draper and Smith (1966). Applied regression analysis

• Hader and Grandage (1958). Simple and multiple regression
analysis

• Kerlinger and Pedhazur (1973). Multiple regression in
behavioral research

Matrix mathematics

• Ayres, Jr. (1962). Theory and problems of matrices

• Draper and Smith (1966). Applied regression analysis

• Kerlinger and Pedhazur (1973). Multiple regression in
behavioral research. Appendix A

*Complete references can be found in the Reference list at the end of the
complete report.

Computer programs

Dixon, (1979^). BMD: Biomedical computer programs. 

JUG Computer programs directory

Kerlinger and Pedhazur (1973). Multiple regression in
behavioral research. Appendices B and C.

Kuo (1972). Computer applications of numerical methods.

Nie, Bent, and Hull (1970). Statistical package for the social
sciences.

NASA Computer program abstracts

COMPUTER PROGRAMS FOR ADDING DATA POINTS

Computer routines will be needed to calculate the variance of the
estimated response at a point or the determinant of the X'X matrix (e.g.,
Table T,?). Random search or optimization routines are also required for
that method of adding data points to improve orthogonality of the undesigned
experiment.

137 \ 

Variance Criterion 

If the variance of the estimated response, V(y) at point x^ on a response 
surface, is to be used as the criterion, it can be cafculated using the equation 



Computer Program Abstracts is an indexed abstract journal listing docu-
mented computer programs developed by or for the National Aeronautics
and Space Administration and the Department of Defense, which are
offered for sale through NASA-sponsored Industrial Applications Centers
and the Computer Software Management and Information Center (COSMIC).

Computer Program Abstracts is available to the public on subscription or
by individual issues from the Superintendent of Documents, United States
Government Printing Office, Washington, D.C. 20402, USA. Rates as of
August 1975 for an annual subscription were: $3.30 domestic; $4.15
foreign.

The Joint User Group (JUG) of the Association for Computing Machinery
Computer Programs Directory was begun in 1971 and updated several times
since then. Its purpose is to exchange program documentation among
computer user groups. It is published by CCM Information Corp., subsidiary
of Crowell Collier and Macmillan, Inc., 909 Third Ave., NY, NY 10022.
employed by Dykstra (1971, p. 683), Draper and Smith (1968, p. 56) and
others:

$$V(\hat{y}_0) = \sigma^2 \, x_0' A^{-1} x_0$$



This expression, in matrix algebra, requires the following operations and
computer subroutines to perform the operations:



    X'X = A            Matrix multiplication

    A^-1               Matrix A inversion

    x to x'            Vector x transposition,
                       horizontal to vertical

    x' A^-1 x          Matrix/vector multiplication

Since the σ² in the above equation (i.e., the error variance) is a constant, it
need not be included if the equation is to be used only to compare various
data points.
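
The operations listed above (form X'X, invert it, and pre- and post-multiply by the point x0) can be sketched as follows. This is a minimal illustration in Python rather than one of the subroutines cited in this report; the function names and both small designs are hypothetical, and σ² is omitted because only comparisons between points are wanted.

```python
# Illustrative sketch only: function names and data are hypothetical.

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular and cannot be inverted")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def relative_variance(X, x0):
    """Return x0' (X'X)^-1 x0, that is, V(y-hat) at x0 divided by sigma^2."""
    A = matmul(transpose(X), X)        # X'X = A
    A_inv = invert(A)                  # A^-1
    row = matmul([x0], A_inv)[0]       # x0' A^-1
    return sum(r * x for r, x in zip(row, x0))

# Hypothetical orthogonal design (a 2 x 2 factorial in coded units).
X_orthogonal = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]

# Hypothetical undesigned data in which the two predictors rise together.
X_correlated = [[1.0, 1.0], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2]]

v_along = relative_variance(X_correlated, [1.0, 1.0])    # point in line with the data
v_across = relative_variance(X_correlated, [1.0, -1.0])  # point off the data trend
```

In the correlated example the point lying across the trend of the existing data has by far the larger variance, which is what marks it as a useful place to add an observation.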

Determinant Criterion

Computer programs for calculating determinants can be found in the
general references cited above. Also, most computer manufacturers supplying
subroutine packages with their systems include programs for calculating
the determinant of a matrix and eigenvalues. It must be remembered that
the product of the eigenvalues of a matrix equals the determinant. The main
problem in selecting a program is not whether it calculates the desired
values but whether it does so most efficiently.
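
The relation between eigenvalues and the determinant can be checked by hand for a two-variable case. The sketch below is illustrative only: the function name and the correlation of 0.9 are assumptions made for the example, and the closed-form solution applies to 2 x 2 symmetric matrices only.

```python
import math

def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]] (closed form)."""
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    return mean + radius, mean - radius

# Hypothetical correlation matrix for two predictors correlated 0.9.
lam1, lam2 = eigenvalues_2x2_symmetric(1.0, 0.9, 1.0)
determinant = 1.0 * 1.0 - 0.9 * 0.9   # ad - b^2 for a 2 x 2 matrix
product = lam1 * lam2                 # should agree with the determinant
```

A small eigenvalue (here about 0.1) is what drives the determinant toward zero when predictors are highly correlated.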
X as used here is the matrix formed by the elements of the independent
variables, such as in tables T.2 and T.3 in the text.

x0 is a vector of values, e.g., 4, 6, 3, 7, which represents the levels or
coordinates of four variables, A, B, C, D, and thus describes the data
point (experimental condition).



The determinant is calculated and printed out in the ridge regression
analysis program provided in Appendix C. However, it ordinarily would not
be economical to use that program to calculate the determinant as a criterion
for selecting data points to repair an undesigned experiment.

Random search routines

No general purpose search program is recommended here. However,
Mitchell and Miller (1970) employ the same principles to construct D-optimal
experimental designs as would be needed to add data points to the matrix of
an undesigned experiment using the determinant criterion. D-optimal
designs are those for which the determinant of the X'X matrix is maximum,
where X is the matrix of independent variables in the usual linear regression
model. Mitchell (1974) describes application of the algorithm, DETMAX, to
construct D-optimal designs. In his paper, Mitchell states (p. 209): "A
FORTRAN listing of DETMAX is available on request to the Computer Sciences
Div., Math. and Stat. Research Staff, Union Carbide Corp. Nuclear Division,
P.O. Box Y, Building 9704-1, Oak Ridge, Tennessee."

Box and Draper (1971) mention an optimization routine due to Powell
(1964), of the direct search type, that maximizes the determinant. However,
in their application it was only suitable for relatively small designs (np less
than 30). Hebble and Mitchell (1972, p. 768) refer to a paper by Spang (1962)
for a general discussion of random search procedures.

If the candidate approach is used to add data points, then there is no
need for a random search program. Instead, the variance at the candidate
points or the determinant of the new matrix when each point is added to the
original design can be determined and compared to find which is the largest.
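
A minimal sketch of the candidate approach with the determinant criterion follows. It is an illustration in Python, not the DETMAX algorithm or any program cited above; the function names, the existing design, and the candidate points are all hypothetical.

```python
# Illustrative sketch only: names and data are hypothetical.

def cross_products(X):
    """Return X'X for a design matrix stored as a list of rows."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)]
            for i in range(p)]

def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in a]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # a row swap changes the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return det

def best_candidate(X, candidates):
    """Try each candidate as one added row; return (determinant, point) for the largest."""
    scored = [(determinant(cross_products(X + [c])), c) for c in candidates]
    return max(scored, key=lambda pair: pair[0])

# Hypothetical undesigned data with two highly correlated predictors.
X = [[1.0, 1.0], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2]]
candidates = [[1.0, 1.0], [1.0, -1.0], [4.0, 4.0]]

best_det, best_point = best_candidate(X, candidates)
```

For a single added point the two criteria agree, because the determinant after adding x0 equals the old determinant times (1 + x0' A^-1 x0); the candidate with the largest variance is therefore also the one that gives the largest determinant.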



COMPUTER PROGRAMS FOR RIDGE REGRESSION ANALYSIS 



A convenient method of developing a computer program for ridge
regression analysis is to modify a conventional multiple
regression program. The bias, k, is introduced by adding a constant, k,
to the unit diagonal of the correlation matrix and doing a least squares fit
on the modified matrix. This process is iterated using different k values
until enough ridge coefficients are obtained to plot the data and select the
location of stability, meaningfulness, and so forth.
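
The modification just described can be sketched in a few lines. The example below is a hedged illustration in Python, not the BMDP2R modification of Appendix B or the FORTRAN program of Appendix C; the function names, the two-predictor correlation matrix, and the predictor-criterion correlations are hypothetical.

```python
# Illustrative sketch only: names and data are hypothetical.

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ridge_coefficients(R, r_xy, k):
    """Add k to the unit diagonal of R, then solve for standardized coefficients."""
    p = len(R)
    biased = [[R[i][j] + (k if i == j else 0.0) for j in range(p)]
              for i in range(p)]
    return solve(biased, r_xy)

R = [[1.0, 0.9], [0.9, 1.0]]   # correlations among the predictors
r_xy = [0.8, 0.7]              # correlations of each predictor with the criterion

# One fit per k value; k = 0 is the ordinary least squares solution.
trace = {k: ridge_coefficients(R, r_xy, k) for k in (0.0, 0.05, 0.1, 0.2, 0.5)}
```

In this made-up case the least squares solution (k = 0) gives the second predictor a negative weight even though it correlates positively with the criterion; by k = 0.2 both weights are positive and changing slowly, which is the kind of behavior a plot of the coefficients against k is meant to reveal.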

If the University of California Biomedical Data Processing (BMDP)
Manual is available, then a modification of the BMDP2R (Stepwise regression)
program, developed by Maryann Hill (1975) of the UCLA Health Sciences
Computing Facility, can be used for a ridge regression analysis. The complete
article describing this modification has been reproduced in Appendix B.

A more complete computer program for ridge regression was prepared
by Mary G. Gallegos of the Display Systems and Human Factors Department,
Hughes Aircraft Company, Culver City, California. This is reproduced in
Appendix C along with a sample problem in Appendix D. The program, however,
was written for a particular problem and, as listed, has dimension
statements and other features specific to that problem. With relatively little
effort, a competent programmer could use it as a guide to fit other parameters
and other computers.

PRELIMINARY PREPARATIONS FOR THE DATA ANALYSIS

There is no such thing as a completely automated data analysis, only
automated aids to data analysis. Computers are available to handle the
routine manipulation of numbers, but they are not intended to decide what
manipulations are required, what assumptions are to be made, what data are
to be fed into them, nor how to interpret the output. These are the
responsibilities of the investigator.



The BMDP Manual of statistical computer programs is available from
University of California Press, 2223 Fulton Street, Berkeley, California
94720 at $10.00 per copy.



An investigator without sufficient background of his own who wishes to employ
the techniques proposed in this report will need the help of a competent
computer specialist and statistician. However, while their technical aid can be
of considerable value, the investigator must understand exactly what is being
done by a computer performing an analysis, and why, and must not allow
critical decisions to be made for him. He is the only one who knows the
intended use of the data, its sources, and other critical factors. Any
employment of outside talent should be a part of a team effort, with the
experimenter in complete control.

Data Selection

Before any computations are begun, a preliminary analysis of all of the
available data should be made to be assured that all should be included in the
formal analysis. Particularly with the undesigned experiment, where variable
upon variable can be added by simply making more measurements (sometimes
after the fact), the dangers of a superabundance of inconsequential
variables included by an overcurious investigator should be avoided. A
preanalysis ought to consider seriously the relevance of the variables under
consideration, and even an examination of a table of intercorrelations could
suggest which variables are mathematically identical and should not be
included twice.

Anscombe (1967, p. 38) has this to say about this matter:

    "In considering multiple regression with large numbers of
    potential 'explanatory' variables, I would like to echo and
    extend Dr. Yates's remark on the value of understanding
    the x-variables first, before seeking to relate any of them
    to the y-variable. Put very briefly, I have never come
    across an occasion where one wanted to construct a multi-
    variate relationship without already knowing enough about
    the x's not to have to do a formal 'search' operation of the
    multiple regression form. One must be extraordinarily
    uninformed about one's subject-matter simply to wish to
    put all 'possible' variables into a multiple regression
    black-box and trust to least squares to sort them out.
    Modern technology may now facilitate almost incompre-
    hensibly vast multiple regression analyses at almost
    incredible speeds, but this can only serve to verify
    the dictum that the computer shows things to be unnecessary
    which were previously impossible."

Gorman and Toman (1966, p. 27) discuss the idea of a preliminary
examination of the data in a slightly different way, thus:

    "Before variables are selected the data must be examined
    carefully for statistical difficulties such as split plotting,
    serious departures from normal distribution of residuals,
    serial correlation of residuals, and outliers, and for func-
    tional difficulties such as the improper choice of X's in the
    complete equation. Statistical defects are usually spotted
    by a careful examination of residuals after the equation has
    been fitted with all k variables present (1, 2, 3). The choice
    of the X's and their functional forms (i.e., X1 = 1/T,
    X2 = log (SV), etc.) is really a matter of technical judgment
    by experts in the field from which the data are drawn. Here
    again, careful examination of residuals can expose improper
    choices of functional forms."

There is just no substitute for the early application of intuitive judgment by
an investigator who knows his business.

Input Accuracy

Before data are fed to the computer, they should be carefully checked for
accuracy. When a great deal of data must be key-punched, it is all too easy
for mistakes to be made. Much frustration can be avoided if the investigator
takes the extra time at the beginning to inspect a print-out of the input cards
himself. It is amazing how easy it is for a person who knows how the
information should appear to spot errors that would never be evident to a
key-punch operator or a less-informed technician.

Program Precision 



For handling large multiple regression analyses, it is wise to request that
the computer be programmed to handle double precision arithmetic. The
problem of rounding errors, with serious consequences to the results, in
analyses of this type has been discussed by Draper and Smith (1968, pp 14^-145)
and by Freund (1963). Neither of the ridge analysis programs listed in
Appendices B and C is written in double precision. It has been pointed out
that although the consequences of imprecision increase with ill-conditioned
matrices, the very process of ridge analysis corrects the sensitive condition.

Then, too, before the consequences of single and double precision can be
estimated, it is necessary to know how many bits per word are involved, and
this depends on the particular computer. Single precision for one computer
may be more precise than double precision for a smaller one.

Draper and Smith (1968, p. 148) point out the value of working from the
correlation matrix. They say: "Transforming the regression problem into a
form in which it involves correlations is good in general because it makes all
of the numbers in the calculations lie between -1 and 1. When numbers are
all of this order the adverse effects of roundoff error are minimized."
Certainly avoiding sources of imprecision is a matter of prudence. However,
the value of double precision for the particular set of data must be weighed
against the requirement for a larger computer memory and a possible
limitation on the amount of data that could be analyzed.
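
The transformation to correlation form can be sketched as follows. This is a hedged Python illustration, not part of either appendix program; the function name and the raw data (three predictors on very different scales) are hypothetical.

```python
import math

def correlation_matrix(rows):
    """Standardize each column, then form Z'Z / (n - 1)."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[i] * zr[j] for zr in z) / (n - 1) for j in range(p)]
            for i in range(p)]

# Hypothetical raw data: columns differ in size by several orders of magnitude.
raw = [
    [1200.0, 0.011, 3.0],
    [1500.0, 0.015, 7.0],
    [1700.0, 0.018, 4.0],
    [2100.0, 0.020, 9.0],
    [2500.0, 0.026, 6.0],
]

R = correlation_matrix(raw)
```

Whatever the original units, every entry of R lies between -1 and 1 and the diagonal is 1, so later sums and products work with numbers of comparable size.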
APPENDIX B
RIDGE REGRESSION USING BMDP2R

Maryann Hill

in BMD COMMUNICATIONS, Health Sciences
Computing Facility, University of California,
Los Angeles, February 1975, No. 3



In a regression analysis when the independent variables are highly
correlated, the data are often said to be ill conditioned. The
resulting regression coefficients may be quite unstable and not
useful for future predictive purposes on a new sample. Ridge
regression is a technique that is used to "tame" the estimates of
regression coefficients, to portray sensitivity of the estimates to the
particular set of data being used, and to obtain point estimates with
smaller mean square error (although the estimates will be biased).

In the regression model Y = Zβ + e, the ridge estimate of the
coefficient vector β is

$$\beta^* = (Z'Z + \lambda I)^{-1} Z'Y$$

where Z is the matrix (n cases by p variables) of the standardized
independent variables and Y is the vector of the standardized
dependent variable. The usual least squares estimate is obtained
when λ = 0.



Plotting the resulting coefficients for a number of values of λ gives
an indication of the stability of the coefficients. You hope to find
the value of λ where the coefficients begin to smooth out and no
longer make sudden changes (e.g., switching sign). The estimates of
the coefficients eventually approach zero as λ goes to infinity.



By adding "dummy" cases to the end of the standardized data file
and using the zero intercept option (TYPE=0), you can try this
technique with your own data using BMDP2R. The "dummy" cases
determine the amount added to the diagonal of the Z'Z matrix. Add
one "dummy" case for each of the p independent variables with
√((n-1)λ) as the value of the corresponding variable and zeros for the
remaining variables. Note that the Z'Z matrix is (n-1) times the
correlation matrix. It is useful to think of ridge regression in terms
of the correlation matrix: the size of the value added to the diagonal
elements of the correlation matrix is then comparable from problem
to problem. In this context values of λ less than one are of most
interest.
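
Why the "dummy" cases work can be verified numerically. The sketch below is illustrative Python, not BMDP2R input; the function names and the small standardized matrix Z are hypothetical. It shows that appending p rows carrying √((n-1)λ) on the diagonal adds (n-1)λ to each diagonal element of Z'Z and leaves the off-diagonal elements unchanged, which is the same as adding λ to the diagonal of the correlation matrix.

```python
import math

def cross_products(Z):
    """Return Z'Z for a data matrix stored as a list of rows."""
    p = len(Z[0])
    return [[sum(row[i] * row[j] for row in Z) for j in range(p)]
            for i in range(p)]

def add_dummy_cases(Z, lam):
    """Append one dummy case per variable, with sqrt((n-1)*lam) in that variable."""
    n = len(Z)                      # number of real cases
    p = len(Z[0])
    value = math.sqrt((n - 1) * lam)
    dummies = [[value if i == j else 0.0 for j in range(p)] for i in range(p)]
    return Z + dummies

# Hypothetical standardized data: n = 3 cases, p = 2 variables,
# each column with mean 0 and sum of squares n - 1 = 2.
s = 1.0 / math.sqrt(3.0)
Z = [[-1.0, s], [0.0, -2.0 * s], [1.0, s]]

lam = 0.16
before = cross_products(Z)
after = cross_products(add_dummy_cases(Z, lam))

# Value entered on the dummy cards when n = 10 and lambda = .16.
dummy_value_n10 = math.sqrt((10 - 1) * 0.16)
```

The same least squares routine, run with a zero intercept on the augmented file, therefore returns the ridge solution without any change to the program itself.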



Example: Hoerl (1962) discussed a ridge technique in an article
dealing with the measurement of the performance of a chemical
process. He specified a relationship between three highly correlated
process variables and a response variable, added random noise to the
response variable and then analyzed the data. Although the specified
relationship had all positive coefficients, the usual least squares
solution produced inflated coefficients - one of which was negative.
He then applied a ridge technique to these data showing the taming
effect on the coefficients and producing solutions closer to the
"true" values.

To see the effect of λ = .16 on the regression coefficients for the
Hoerl data, we compute √((n-1)λ) = √1.44 = 1.2 and submit the
following cards for the HSCF system:



    // EXEC BIMEDT,PROG=BMDP2R
    //TRANSF DD *
          IF(KASE.GT.10)GOTO 1              Using the sample x̄ and s
          X(1)=(X(1)-1.82)/.4022            to standardize the in-
          X(2)=(X(2)-1.86)/.4088            dependent variables,
          X(3)=(X(3)-1.88)/.4492            X(1), X(2), X(3) and the
          X(4)=(X(4)-28.9)/4.0213           dependent variable, X(4)
        1 CONTINUE                          for 10 cases
    /*
    //GO.SYSIN DD *
    PROBLEM    TITLE IS RIDGE./
    INPUT      VARIABLES ARE 4.
               FORMAT IS '(4F4.1)'./
    REGRESSION DEPENDENT IS 4.
               ENTER IS .001. REMOVE IS 0.
               TYPE IS ZERO./
    END/
      11  11  11 223
      14  15  11 223
      17  18  20 292
      17  17  18 270
      18  19  18 285                        10 cases of raw data
      18  18  19 304
      19  18  20 311
      20  21  21
      23  24  25 328
      25  25  24 340
      12   0   0
       0  12   0                            3 dummy cases
       0   0  12
    /*
    //



Inserting different values of √((n-1)λ) and rerunning the program, we
obtain and plot the estimates of the coefficients for each λ (see
figure). (Note: A number of problems can be run together and the
BIMEDT procedure can be used to change the values of the dummy
cases in each problem.)
PLOT OF RIDGE COEFFICIENTS
for different values of λ




You will also want to plot the residual sum of squares versus λ. The
most desirable coefficients hopefully will correspond to that value
of λ where the residuals have not started to increase rapidly, but yet
the values of the coefficients have settled down.



The deck SL»tup above produced this result for 
X=.Ifi (B,=.295, B,=.IS5, B,=.479). The least 
squares solution is marked at (B|=-827, 
B3=%56I;B,=.713). 
APPENDIX C
A SAMPLE PROGRAM FOR RIDGE REGRESSION ANALYSIS

Programmer: Mary G. Gallegos

(Original program was written by Charles Bahun for a GE-635 machine in
Basic. The Basic program was converted to FORTRAN IV for a Xerox
Sigma V machine, with subsequent modifications. Final program size is
11.8 thousand words, 32 bits per word.)
C  ******* RIDGE REGRESSION PROGRAM *******
C
C  ONE MAIN - ONE SUBROUTINE - MAXIMUM MATRIX DIMENSIONS ARE IN THE
C  MAIN. VARIABILITY IN THE DIMENSIONS IS ACCOMPLISHED THROUGH THE
C  SUBROUTINE CALL. SUBROUTINE DONT DOES ALL THE WORK. SUBROUTINES
C  USED FOR MATRIX OPERATIONS AND EIGENVALUES ARE CALLED FROM THE
C  USER LIBRARY. THESE ROUTINES ARE PART OF THE XEROX NUMERICAL
C  SUBROUTINE PACKAGE.
C
C  FUNCTION - TO DO A RIDGE REGRESSION ANALYSIS ON A GIVEN STUDY.
C  COMPUTES:
C     1   MEAN VALUE FOR EACH COLUMN IN MATRIX X
C     2   MEAN VALUE FOR EACH COLUMN IN MATRIX Y
C     3   STANDARD DEVIATION FOR EACH COLUMN IN MATRIX X
C     4   STANDARD DEVIATION FOR EACH COLUMN IN MATRIX Y
C     5   TABLE OF INTERCORRELATIONS
C     6   EIGEN VALUES FOR TABLE OF INTERCORRELATIONS
C     7   DETERMINANTS FOR TABLE OF INTERCORRELATIONS
C     8   LEAST SQUARES AND RIDGE COEFFICIENTS FOR TABLE OF
C         INTERCORRELATIONS
C
C  THE RIDGE REGRESSION PROGRAM CAN HANDLE TWO KINDS OF DATA FOR THE
C  TABLE OF INTERCORRELATIONS:
C     1   RAW DATA - THE X AND Y MATRICES ARE READ IN AND THE TABLE OF
C         INTERCORRELATIONS (MATRIX R) IS COMPUTED
C     2   MATRIX R IS READ IN STRAIGHT FROM CARDS - THE TABLE OF
C         INTERCORRELATIONS ALREADY COMPUTED.
C
C  IF DESIRED, CERTAIN COLUMNS READ INTO MATRIX X CAN BE SINGLED
C  OUT OF THE COMPUTATION FOR THE TABLE OF INTERCORRELATIONS. THIS IS
C  ONLY AVAILABLE IF THE DATA FOR THE TABLE OF INTERCORRELATIONS IS TO
C  BE COMPUTED WITHIN THE PROGRAM.
C
C  VARIABLES
C     X     - INDEPENDENT VARIABLE IN FUNCTION
C     Y     - DEPENDENT VARIABLE
C     R     - TABLE OF INTERCORRELATIONS (X)
C     RL    - EIGEN VALUES
C     DETER - DETERMINANTS
C     Z     - TABLE OF INTERCORRELATIONS (Y)
C     E     - STANDARD DEVIATIONS FOR Y
C     G     - STANDARD DEVIATION FOR X
C     MEAN  - MEAN VALUE (USED FOR BOTH X AND Y)
C     K9    - ERROR FACTORS
C     N     - NUMBER OF ROWS IN MATRIX X (OR NUMBER OF DATA SETS)
C     IX    - NUMBER OF COLUMNS TO REMAIN IN X MATRIX FOR COMPUTATION
C             IN TABLE OF INTERCORRELATIONS.
C     IS    - ACTUAL NUMBER OF COLUMNS READ IN FOR X MATRIX
C     IY    - ACTUAL SIZE OF Y MATRIX. (NUMBER OF ACTUAL COLUMNS, MAX 3)
C     IN    - NUMBER OF K'S TO BE INTRODUCED (READ IN)



      DIMENSION A(15,60)
      DIMENSION BX(15,3)
      DIMENSION B(15,3)
      DIMENSION BB(15,3)
      DIMENSION C(15,3)
      DIMENSION D(15,15)
      DIMENSION E(3)
      DIMENSION F(3,15)
      DIMENSION FB(15,15)
      DIMENSION FF(15,15)
      DIMENSION G(15,1)
      DIMENSION P(1,15)
      DIMENSION Q(1,3)
      DIMENSION RL(15)
      DIMENSION RLL(15)
      DIMENSION RM(3,3)
      DIMENSION T(15,3)
      DIMENSION U(15,1)
      DIMENSION X(60,15)
      DIMENSION Y(60,3)
      DIMENSION Z(15,3)
C
C  VARIABLES IX AND IY DETERMINE DIMENSION WITHIN SUBROUTINE
C
      READ (105,961) N, IX, IS, IY, IN
      CALL DONT (A,B,BB,C,D,E,F,FB,FF,G,P,Q,R,RM,T,U,W,X,Y,Z,IX,IY,
     *N,IN,BX,IS)
  961 FORMAT (5(I2,2X))
      END


      SUBROUTINE DONT (A,B,BB,C,D,E,F,FB,FF,G,P,Q,R,RM,T,U,W,X,Y,Z,IX,IY,
     *N,IN,BX,IS)
      DIMENSION A(IX,N)
      DIMENSION B(IX,IY)
      DIMENSION BB(IX,IY)
      DIMENSION BX(IX,IY)
      DIMENSION C(IX,IY)
      DIMENSION D(IX,IX)
      DIMENSION DETER(15)
      DIMENSION E(IY)
      DIMENSION F(IY,IX)
      DIMENSION FB(IX,IX)
      DIMENSION FF(IX,IX)
      DIMENSION G(IX,1)
      DIMENSION P(1,IX)
      DIMENSION Q(1,IY)
      DIMENSION R(IX,IX)
      DIMENSION RM(IY,IY)
      DIMENSION T(IX,IY)
      DIMENSION U(IX,1)
      DIMENSION X(N,IX)
      DIMENSION W(IX,IX)
      DIMENSION Y(N,IY)
      DIMENSION Z(IX,IY)
      DIMENSION BXX(15)
      DIMENSION RL(15)
      DIMENSION RLL(15)
      DIMENSION WK1(15)
      DIMENSION WK2(15)
      DIMENSION LAB(18)
      DIMENSION IT(2)
      DIMENSION ICH(15)
      DIMENSION S9(3)
      DIMENSION VAL(15)
      DIMENSION IFORM(12)
      DIMENSION TEMP(15)
      DIMENSION PEIG(15)
      REAL K9
      REAL MEAN
      INTEGER ANS
      INTEGER ANSR
      INTEGER YES
      DATA YES /4HYES /
      DATA IT /4HXXXX,4HYYYY/
DATA BLANK /i>,\\/ / . % 

DATA IFBRM /^MlX/A/^W^/lXi^H* (/i 



C 
C 

c 
c 
c 



c 
c 
c 

c 



r4HF6t3*4H*lX)*i»H* (*4HA4*3, 
• 4HX)i S^Wi •,3,4H(F6»i4H3,lX*4H) ) / , 

C  IANS -- EITHER A YES OR A NO. THIS DETERMINES WHETHER THE TABLE OF
C  INTERCORRELATIONS IS TO BE COMPUTED WITHIN THE PROGRAM.
C  IF YES, GO TO 1111.
C
      READ (105,957) IANS
      IF(IANS.EQ.YES) GO TO 1111
      DO 830 I=1,IX
      DO 880 J=1,IY
  880 READ (105,950) Z(I,J)
      DO 890 K=1,IX
  890 READ (105,950) R(I,K)
  830 CONTINUE
      GO TO 9988
C
C  THIS SECTION LETS YOU PULL OUT CERTAIN COLUMNS IN X MATRIX FOR
C  COMPUTING THE TABLE OF INTERCORRELATIONS
C
 1111 NUM=IS-IX
      IF(NUM.EQ.0) NUM=1
      READ (105,965) (ICH(J),J=1,NUM)
      DO 50 I=1,N
      READ (105,900) (TEMP(J),J=1,IS)
      READ (105,950) (Y(I,K),K=1,IY)
      IX=1
      DO 45 J=1,IS
      DO 40 K=1,NUM
      IF(J.EQ.ICH(K)) GO TO 45
   40 CONTINUE
      X(I,IX)=TEMP(J)
      IX=IX+1
   45 CONTINUE
C
C  COMPUTES MEAN VALUES AND STANDARD DEVIATION FOR EACH COLUMN IN THE
C  X AND Y MATRICES
C
      IX=IX-1
      DO 30 J=1,IX
   30 P(1,J)=P(1,J)+X(I,J)
      DO 31 K=1,IY
      E(K)=E(K)+(Y(I,K))**2
   31 Q(1,K)=Q(1,K)+Y(I,K)
      WRITE (108,902) IT(1),(X(I,K),K=1,IX)
      WRITE (108,902) IT(2),(Y(I,K),K=1,IY)
   50 CONTINUE
      CALL S004 (X,A,N,IX)
      CALL S003 (A,X,W,IX,N,IX)
      CALL S003 (A,Y,T,IX,N,IY)
      WRITE (108,954)
      WRITE (108,964) IT(1)
      DO 331 J=1,IX
      MEAN=P(1,J)/N
  331 WRITE (108,950) MEAN
      WRITE (108,901)
      WRITE (108,964) IT(2)
      DO 332 J=1,IY
      MEAN=Q(1,J)/N
  332 WRITE (108,950) MEAN
      WRITE (108,901)
      WRITE (108,963) IT(1)
      DO 60 I=1,IX
      G(I,1)=SQRT((W(I,I)-(P(1,I)**2)/N)/N)
      WRITE (108,950) G(I,1)
   60 CONTINUE
      WRITE (108,901)
      WRITE (108,963) IT(2)
      DO 70 I=1,IY
      E(I)=SQRT((E(I)-(Q(1,I)**2)/N)/N)
      WRITE (108,950) E(I)
   70 CONTINUE
C
C  THE TABLE OF INTERCORRELATIONS - R FOR X MATRIX, Z FOR Y MATRIX.
C
      DO 100 I=1,IX
      DO 80 J=1,IY
   80 Z(I,J)=(T(I,J)-P(1,I)*Q(1,J)/N)/N/G(I,1)/E(J)
      DO 90 K=1,IX
   90 R(I,K)=(W(I,K)-P(1,I)*P(1,K)/N)/N/G(I,1)/G(K,1)
  100 CONTINUE
 9988 CONTINUE
      WRITE (108,954)
      WRITE (108,969)
      WRITE (108,954)
      READ (105,952) (LAB(K),K=1,18)
      WRITE (108,903) (LAB(K),K=1,18)
IHiig«IX 

D8 111 I.l/lX , ^ ^ 

iF(lMtEQtO) WRITE (108/909) LAB 1 1 ) # ( R ( N K ) # K«li 1 X ) / 
« (Z(I#K)#K«1#IY)I QSTQ 111 
ENCeoE ( 968,# -1 FORM ( 3 ) j JDUM > * 1 X 
ENC6De(4#968#IF5RM(6)MDUM)#lH 

WRITE (l08/IF9RM)LA8(I)i (R( I#K)iK»l# 1X)# (BLANK, lH)s 
« rZ(IiK)/K«lilY) ' . ^ 

111 CONTINUE. 

WRITE (I0»i901) 

C
C  EIGEN VALUES AND DETERMINANTS FOR THE TABLE OF INTERCORRELATIONS.
C
      CALL CHANGESS (R,FF,IX,0,1)
 8112 CALL EIGEN (FF,FB,IX,1)
 8113 CALL CHANGESS (FF,RL,IX,1,2)
      RLL(1)=1/RL(1)
      DETER(1)=RL(1)
      DO 113 I=2,IX
      DETER(I)=DETER(I-1)*RL(I)
  113 RLL(I)=RLL(I-1)+1/RL(I)
      DO 3113 I=1,IX
      PEIGT=PEIGT+RL(I)
      PEIG(I)=PEIGT/IX
 3113 CONTINUE
      WRITE (108,910)
      DO 114 J=1,IX
  114 WRITE (108,911) RL(J),RLL(J),PEIG(J),DETER(J)
      WRITE (108,954)
      DO 7115 KK=1,IX
      IF(RL(KK).LT.0) GO TO 999

C 
C 

c 
c 
c 
c 
c 
c 
c 



7115 

ePTI9N TR PAUSE AFTER EACH BETA PRINT 
READ. (105/957) ANS 

RIDGE COEFFICIENTS FOR THE TABLE 9F INTERC9RRELATI9NSf. 
♦ ^PREVIOUS' ly PRINT OUT « MATRIX B& 
♦'CURRENT' IN PRINT OUT ■ MATRIX B 
« 'CURRENT Bt IN PRIMT OUT ■ MATRIX BX 
« TOTAL. UNDER iCURRENf 8' ■ BXX 



1?9 



130 



00 "200 l.li IN 
00 i;?9 J.lilY 

BXX( J) ».0» 
WRITP ilC^iSOV) 
READ (10Bi951) 
WRITE (108/904) ^9 
DO 130 J«liIX 
R(JiJ)»l+K9 

CALL MC19 (R>0* iXi IX/O) 

CALL iNi^F^TC?* IX iPETiV^i^i^w^p) 

CALL S003 (DiZiBi IXi IXi lY) 

IF( I ,QTt3 ) Cie TO 140 

CALL M019 (8iC# IXi lYiO) 

CALL SC04 (CiFi iXi TV) 

CALL S003 (UiOi^ilX^IXilX) 

CALL SCC3 (w,CiT# IX^ IXi IV ) 

CALL SC03 (FiTiRMi.iYi IXi lY) 

WRITE (108/906) 
      DO 141 K=1,IY
      DO 141 J=1,IX
      BX(J,K)=B(J,K)*(E(K)/G(J,1))
      BXTEMP=(B(J,K))**2
  141 BXX(K)=BXX(K)+BXTEMP
      DO 150 J=1,IX
  150 WRITE (108,905) (BB(J,K),B(J,K),BX(J,K),K=1,IY)
      WRITE (108,966) (BXX(K),K=1,IY)
      WRITE (108,901)
C
      IF(ANS.NE.YES) GO TO 155
      WRITE (102,958)
      READ (102,953) ANSR
      IF(ANSR.NE.YES) GO TO 999
C
C  LEAST SQUARES FOR THE TABLE OF INTERCORRELATIONS.
C     * 'R SQUARED'     = H
C     * 'ERROR SQUARED' = S
C     * 'VARIANCE'      = VAR
C     * 'BIAS SQUARED'  = B9
C     * 'RIDGE'         = RIDGE
C
  155 DO 170 J=1,IY
      H=0.
      V1=0.
      DO 160 K=1,IX
      H=H+B(K,J)*Z(K,J)
      V1=V1+RL(K)/((RL(K)+K9)**2)
  160 CONTINUE
      IF(I.EQ.1) S9(J)=1.-H
  165 S=1-H
      VAR=S9(J)*V1
      B9=(K9**2)*RM(J,J)
      RIDGE=(S9(J)*V1+B9)
      WRITE (108,908) J,H,S,VAR,(VAR/S9(J)),B9,(B9/S9(J)),RIDGE,
     * (RIDGE/S9(J))
  170 CONTINUE
      WRITE (108,954)
      CALL M019 (B,BB,IX,IY,0)
  200 CONTINUE

235. c . ' , y 

236. 900 FORMAT (2(5(Fl0.4/2X)//)/5iFi0.4/2X) ) 

237. ■ 901 FORMAT (5(1H ) ) 

238. 902 FORMAT (l^X/'MATRIX » * Al/// 3 ( 5 ( IX/F i2«6 ) / / ) ) 

239. 903 FORMAT (43X/»X«X INTERCORRELATIONS i,47X/ iX-Y IfShigRCORRELATIONS I, 
2^0* ? ///6X/15(A4/3X)/1X/3{A4/3X) ) 
2<H. 904 FORMAT (1X/»K VALUE* »/F8i4*///) 

242. 905 FORMAT (3(2(F12.3))) 

243. 906 FORMAT (IX, 'BETA COEFFICIENTS AREt »///lX/ 

244. . 2 3( »PREVI0uS»/4X/ »CURRENT»/5X/ 'CURRENT B»/7X)) 

245. 908 FORMAT (iX/ »FflR DEPE^'DENT VARIABLE » / 1 2/ i / R SQUARED IS »/F7.3/// 

246. 1 IX/ ' ERR5R SQUARED ■ » Fl0i4/// IX/ » VARIANCE • »/Fl0.4*gX#Fl0f 

247. ^ 2 //IX/ »BIAS"S0UARED ■ » *F10«4/ 2X/F 10«4/// IX/ • RIDQE n »/Fl0i4/ 

248. 3 2X/F10.4///) «3 r • 

249. 909 FORMAT ( 1 X/ A4/ I X/ 1 5 ( F6 . 3/ 1 X ) / M ' / 3 (F6 . 3^ iX ) ) 

250. ' 910 FORMAT dX/'EIGE^^' VALUES FOR MATRIX Rj • / // 12X/ • E I GEN VALUES 

251. 2 05X/ 'RECIPROCAL SUM • /5X/ » PROPORTI ON E IQEN »/ 3X/ t DETERMINANT »/// ) 

252. 911 FORMAT (9X/15( El2*4/3X/Fl2i4/9X/Fl2.4/9X/Fl2.4//) ) 

253. 950 FORMAT (Fl2i6) 

254. 951, F0RMAT(FiO.4) 




255* 
256. 
257. 
258* 
259* 
260* 
261* 
262* 
263* 
26if 
265* 
266t 
267. 
268. 
269. 
270. 
271. 
272. 



953 
952 
954 
957 
958 
960 
962 



7 



(15(A<f'>lX),/i3(A4<iy)) ' 
(IMI) 
(<f9XiA4) 

(IX* »X9NTINUE ' ) 

(BX* 'EIQENVECT9RS F9R MATRIX R| «</) 
(49Xil2) 

■( iXi-'STANOARD DEVIATI9\ F'»R EACH CBLUMN 
(IXi'nEAN VALUE FBR EACH C9LUMN -- MATRIX 
(15(2X* 12) ) 
(/*3(l2X<Fl2.6M2Xn 
(t*»#l2*'{') 

(30X*'RlDQE. REQRESS19N ANALYSIS'*/* 
230X* IDETERMIMAMT',/, 
330XiiTABLE 9F INTERC«RRELATI9NS i ii/> 
430X* tEIQENVALUES»,/, 

530X*tLEAST SQUARES AND RIDQE C8EFFICIENTS»////) 
999 END 



reRMAT 
FORMAT 
F9RMAT 
FORMAT 
F9RMAT 
963 FBSmat"" 
96'f F0RMAT 
FORMAT 
FORMAT 
FORMAT 
F9RMAT 



965 
966 
968 

969 



MATRIX^ 
'*A1) r 



'<Al) 



(Questions regarding this program should be referred to the Real Time
Simulation Section, Displays and Human Factors Department, Hughes Aircraft
Company, Culver City, California 90230.)



DATA SET-UP

FIRST CARD:

    Col. 1-2      Number of observations (N)

    Col. 5-6      Number of predictor (X) variables to be analyzed
                  (maximum = 15) (see THIRD CARD, below)

    Col. 9-10     Number of predictor (X) variables in the total data set
                  (maximum = 15)

    Col. 13-14    Number of dependent (Y) variables (maximum = 3)

    Col. 17-18    Number of k values [k's are the constants used to bias the
                  diagonal of the correlation matrix and are referred to as
                  "K9 - error factors" in the program.] (see SEVENTH CARD,
                  below)

SECOND CARD:

    Col. 50-52    If you are starting the analysis by inputting the raw data
                  values of the predictors and associated performance,
                  write YES

    or            otherwise

    Col. 50-51    If you are going to start by inputting a previously calculated
                  correlation matrix, write NO.
THIRD CARD:

    If all of the predictor (X) variables indicated in Col. 9-10
    of the first card are to be included in the analysis, leave
    this card BLANK.

    If some of the predictor variables are not to be included in
    the analysis, then these must be identified by their identifi-
    cation number, i.e., the number of their position on the
    DATA INPUT cards below (which also could be used as the
    label ID required by the FOURTH card).

    The predictor variables to be excluded from the analysis
    are entered:

    Cols. 1-2     BLANK

    Cols. 3-4     ID number of first predictor to be excluded.

    Cols. 5-6     BLANK

    Cols. 7-8     (After two blank spaces, the next two spaces are used to
                  enter the ID numbers, 01 to 15, of each variable to be
                  omitted from the analysis, until all are indicated.)

DATA INPUT CARDS -- N SETS -- FOLLOW AT THIS POINT:

For each observation, a set of data input cards for the values of the
predictors and the performance is required. The order in which the
variables are listed on the cards is fixed and their position can be used
as their identification number (see THIRD and FOURTH cards).

    For predictor (X) variables, a maximum of three cards
    can be used with five inputs on each card in a decimal-
    number format. Ten columns per input, with four
    decimal places, right-justified. There are two spaces
    between each input.

    For dependent (Y) variables, one per card, maximum of
    three cards. Twelve columns per input, with six decimal
    places, right-justified. Y variable cards follow each
    corresponding set of X variable cards.



FOURTH CARD:

    Label identification of X predictor variables (maximum = 15)
    on Table of Data Correlations

    Col. 1-4      Label ID for first X variable, right justified

    Col. 6-9      Label ID for second X variable, right justified

    Col. 11-14    (Continue with four characters per ID and one space
                  between until all are labelled. Leave remainder BLANK.)

FIFTH CARD:

    Label identification of Y dependent variables (maximum = 3)
    on Table of intercorrelations.

    Col. 1-4      Label ID for first Y variable, right justified

    Col. 6-9      Label ID for second Y variable, right justified

    Col. 11-14    Label ID for third Y variable, right justified

                  (If fewer than three Y variables, leave extra columns
                  BLANK.)

SIXTH CARD:

    Do you want the printer to pause after each beta printout?

    Col. 50-51    If you do not wish the printer to pause after printing the
                  beta coefficients for each k, write NO. In this case, it
                  will analyze and print out betas for all k values indicated
                  on SEVENTH card.

    or            Otherwise

    Col. 50-52    If you do wish the printer to pause after printing the beta
                  coefficients for each k in order to inspect the values and
                  possibly decide to abort the program from that point on,
                  write YES.

SEVENTH CARD (SET):

    Use one card of this set for each k factor to be added to
    the matrix. (Maximum = 14)

    Col. 5-10     Use decimal-number format, left justified, beginning with
                  0, then the decimal, and then the numbers of the k, e.g.,
                  0.0
                  0.022
                  0.06
                  0.1
                  0.5

EIGHTH CARD:

    Col. 1-4      RFIN





APPENDIX D 

SAMPLE PRINT-OUT OF RIDGE REGRESSION PROGRAM




These are sample print-outs of critical information in the ridge regression
program listed in Appendix C. Included are: raw score data matrix; means
and standard deviations of all variables; correlation matrix; eigenvalues;
sum of eigenvalue reciprocals; cumulative proportion accounted for by
eigenvalues; determinant of the matrix; and for each value of k: R²,
error squared, normalized variance, normalized bias squared, normalized ridge,
and ridge coefficients for standardized and raw score measures.
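To make the role of k concrete, here is a minimal sketch of how standardized ridge coefficients can be obtained by adding k to the diagonal of the predictor correlation matrix and solving the resulting equations. It is an illustration in Python, not a transcription of the program in Appendix C. The function names and the small data set are invented for the example, and the sketch assumes that no predictor column is constant.

```python
import math


def standardize(columns):
    """Scale each column to zero mean and unit length, so that
    cross-products of the scaled columns are correlations."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        centered = [v - mean for v in col]
        norm = math.sqrt(sum(c * c for c in centered))
        scaled.append([c / norm for c in centered])
    return scaled


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ridge_betas(x_columns, y, k):
    """Standardized ridge coefficients: solve (R + k I) b = r, where R holds
    the predictor intercorrelations and r the predictor-response
    correlations. k = 0 gives the least squares solution."""
    zx = standardize(x_columns)
    zy = standardize([y])[0]
    p = len(zx)
    r_plus_k = [[dot(zx[i], zx[j]) + (k if i == j else 0.0)
                 for j in range(p)] for i in range(p)]
    r_xy = [dot(zx[i], zy) for i in range(p)]
    return solve(r_plus_k, r_xy)


if __name__ == "__main__":
    # Two nearly collinear predictors (made-up numbers).
    x = [[1, 2, 3, 4, 5, 6], [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]]
    y = [1.2, 2.1, 2.9, 4.2, 4.8, 6.1]
    for k in (0.0, 0.022, 0.06, 0.1, 0.5):
        print(k, ridge_betas(x, y, k))
```

Printing the coefficients for a series of k values, as the SEVENTH card set requests, shows how the estimates settle as k grows. The other quantities in the print-out (eigenvalues, normalized variance and bias) are not computed in this sketch.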



[Print-out page: partial print-out of raw score data for fifteen predictor (X) variables and one response (Y) variable, printed under the headings MATRIX X and MATRIX Y.]




[Print-out page: means and standard deviations for each predictor (X) and response (Y) variable, printed under the headings MEAN VALUE FOR EACH COLUMN (MATRIX X, MATRIX Y) and STANDARD DEVIATION FOR EACH COLUMN.]









APPENDIX E

OTHER APPLICATIONS IN THE DESIGN OF EXPERIMENTS FOR WHICH 
THE TECHNIQUES DESCRIBED IN THIS REPORT MIGHT BE USED 

Hebble and Mitchell (1972) illustrate how the maximum Var(ŷ) or
maximum |X'X| criterion can be used for other important purposes, such as
adding data points to:

1. Expand a square region of interest in a second-order model.

2. Alter the model to fit the space.

3. Shift the region of interest.

In these situations they employ candidate points much in the manner proposed
by Dykstra (1971).

Mitchell (1974) uses the maximized |X'X| criterion (with a specified
linear model and a value of n) to:

1. Exchange data points to improve a design.

2. Determine whether more data points might improve the design.

3. Select a best design made up of a subset of candidate points when
limits are placed on the value of n.

4. Supplement "screening" designs (see Simon, 1973) to isolate
two-factor interactions.
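The idea of choosing an added data point by the maximum |X'X| criterion can be sketched as follows. This is only an illustration in Python under stated assumptions: a first-order model with an intercept, a made-up starting design, and hypothetical function names. It is not the procedure of any of the cited authors. For a single added point c, |X'X + cc'| = |X'X| (1 + c'(X'X)^-1 c), so the candidate that maximizes the determinant is also the one at which the variance of the predicted response is largest.

```python
def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] for row in a]
    d = 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        if abs(m[p][i]) < 1e-12:
            return 0.0  # treated as singular
        if p != i:
            m[i], m[p] = m[p], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d


def xtx(rows):
    """Cross-product matrix X'X for a list of model rows."""
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)]
            for i in range(p)]


def model_row(point):
    """Assumed model: first order with intercept, (1, x1, x2, ...)."""
    return [1.0] + list(point)


def best_candidate(design, candidates):
    """Return the candidate point whose addition maximizes |X'X|."""
    base = [model_row(p) for p in design]
    return max(candidates,
               key=lambda c: det(xtx(base + [model_row(c)])))


if __name__ == "__main__":
    # Made-up design in which x1 and x2 are highly correlated.
    design = [(-1, -1), (1, 1), (-0.9, -1), (1, 0.9)]
    candidates = [(1, 1), (0, 0), (1, -1)]
    print(best_candidate(design, candidates))
```

Repeating the selection, each time adding the chosen point to the design, gives a simple sequential way to improve the orthogonality of a correlated design.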
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